Abstract: In the United States, local measures of racial and ethnic diversity are robustly associated with 
lower birth rates. A one standard deviation decrease in racial concentration (having people of many different 
races nearby) or increase in racial isolation (being from a numerically smaller race in that area) is associated 
with 0.064 and 0.044 fewer children, respectively, after controlling for many other drivers of birth rates. 
Racial isolation effects hold within an area and year, suggesting that they are not just proxies for omitted 
local characteristics. This pattern holds across racial groups, is present in different vintages of the US census 
data (including before the Civil War), and holds internationally. Diversity is associated with lower marriage 
rates and marrying later. These patterns are related to homophily (the tendency to marry people of the same 
race), as the effects are stronger in races that intermarry less and vary with sex differences in intermarriage. 
The rise in racial diversity in the US since 1970 explains 44% of the decline in birth rates during that period, 
and 89% of the drop since 2006. 


Contact at umit.gurun@utdallas.edu and david.solomon@bc.edu, respectively. We would like to thank Bill Cready, 
Eric So, James Weston, and seminar participants at Boston College and the University of Western Australia for helpful 
comments and suggestions. All remaining errors are our own. 


Since the middle of the 20th century, the United States has experienced two major demographic 
changes. The first is a large increase in racial diversity. This is dramatic at the national level, but varies 
considerably at the local level. This trend arose both through large increases in immigration, and policies 
designed to reduce racial segregation. The second major change is a considerable decline in birth rates. 
Births per woman (total fertility rate, or TFR) have fallen by more than half, from approximately 3.6 total 
births per woman in 1960, down to an all-time low of 1.64 in 2020, which falls below the replacement birth 
rate needed to sustain the population. Declining fertility during a period of economic growth is prima facie 
surprising (Becker 1960). This is especially so for the recent declines since 2007, which are puzzling and 
hard to explain quantitatively (Kearney, Levine and Pardue 2022). 

We pose a simple question which, to our understanding, has not been previously considered — are 
these two facts related? We argue that they are. We present evidence that birth rates are robustly lower in 
areas of greater local racial and ethnic diversity, after controlling for a wide array of potential confounding 
variables. We consider two slightly different aspects of racial variation within a community. The first, racial 
concentration, is a Herfindahl index of racial groups within a local area (as in Putnam 2007). Intuitively, 
this captures the difference between an area with many groups in small proportions versus mostly one major 
group, and maps to what is often just termed “diversity”. The second measure is a consequence of racial 
diversity at the individual level, which we call “racial isolation” and measure with the race share of the 
population for that person (a lower share means greater isolation). Intuitively, this captures how many 
people in your area are “like you” in racial and ethnic terms. 

There are strong reasons from the economics of marriage to predict that higher diversity may result 
in reduced birth rates. As the number of people of different races in each area has increased, people have 
fewer encounters with others of their own race. Various studies document that people on average have a 
preference for homophily: they prefer to marry those with similar characteristics, particularly people of 
the same race, and even express such preferences when using reproductive technologies (e.g., Bedi 2000, 
Daniels and Heidt-Forsythe 2012, Hwang 2012, and many others). If the number of potential same-race partners 
drops in an area, then either one incurs higher search costs to find a good match, or the quality of matches 
decreases, or both. While the evidence for homophily is extensive, the possibility that this may have implications 
that link the rise in diversity and the decline in birth rates does not seem to have been considered. 

Using US Census and American Community Survey data since 1850, we find that both race 
Herfindahl and race share variables are robustly associated with higher birth rates. That is, being in a more 
racially concentrated area, and being part of a larger group within that area, are both associated with having 
more children, irrespective of the level of controls. With our full set of controls, a one standard deviation 
increase in race share predicts that the average woman aged 18-40 has 0.064 more children, with t-statistics 
generally above 5 when clustered by state and year. For race Herfindahl, in our preferred specification a 
one standard deviation increase is associated with 0.044 more children. We construct these variables at the 
finest geographic level available: city, then county, then detailed metro area. As our baseline race definition, 
we use the census’ broad racial classification plus a separate category for Hispanic/Latino ethnicity. 

Our use of granular panel data combined with high dimensional fixed effects and demographic 
controls considerably narrows the set of plausible explanations for our findings. For instance, the use of 
state-by-year fixed effects helps mitigate concerns that the negative link between fertility and diversity is 
attributable to general economic or cultural attributes of a state. The use of race-by-state and race-by-time 
fixed effects precludes many explanations about general racial differences within a state. We control 
explicitly for demographics (education, income, citizenship, employment, marital status), demographics 
interacted with state and year fixed effects, local area attributes (population, college fraction, income, 
fraction recently moved to the area, employment, age), and local area attributes interacted with year fixed 
effects. The effect is large and highly significant in every specification. At a minimum, the most obvious 
omitted variables and their associated explanations do not seem to be driving the whole effect. 

Because the Herfindahl measure is constructed as the sum of the squared fractions of each group, 
it is necessarily positively correlated with race share. The Herfindahl measure has the same value for 
everyone in an area that year, regardless of their race. However, even within a specific area and year, an 
individual's race share can vary further based on whether they belong to a more or less populous racial 
group relative to the area's overall racial composition. This means that for race share, local-area controls 
can be replaced by an area-by-year fixed effect. If more diverse communities are bigger, richer, denser, have 
higher costs of raising children, or any other omitted factors, these are absorbed in this specification. Only 
variation within a local area and year is used, comparing larger and smaller groups within the area (after 
controlling for patterns in that race-by-state, race-by-year, etc.). 

Racial isolation effects on birth rates survive these area-by-year fixed effects, and their inclusion 
does not greatly change the parameter estimates. The effects of racial isolation are not due to any general 
omitted characteristics of the area that apply to all residents, but appear to capture an effect directly related 
to the size of one’s racial group. The consistent results across racial isolation and racial diversity measures 
suggest that they reflect a similar fundamental process, although directly testing this is challenging. While 
we do not explicitly argue for a causal interpretation of this relationship, we do not preclude one either. 
Diversity is both a cause and a consequence of various underlying factors, and even an exogenous change 
in diversity can have far-reaching effects on many aspects of an area that are hard to disentangle. To better 
understand the potential mechanisms and explanations behind the observed correlation, we employ targeted 
sub-tests designed to isolate specific factors and rule out alternative hypotheses, while acknowledging the 
inherent complexity of the relationship between diversity and birth rates. 

Importantly, the negative association between diversity and birth rates is present throughout U.S. 
history. It holds before the 21st century, before the Civil Rights Act, before World War 2, before the 20th 
century, and, most surprisingly, before the Civil War. That is to say, racial isolation significantly lowers 
birth rates even in periods when slavery was legal, in the 1850 and 1860 Censuses, with a one standard 
deviation increase in race share being associated with 0.33 more children. The parsimonious explanation is 
that whatever is driving the effect must be broadly present in many eras. These results militate strongly 
against explanations that focus on specific events in the history of race relations, whether this be the end of 
segregation, white flight, reconstruction, lynching, the “Black Lives Matter” movement, or anything else. 

Second, the effects are present for many different racial groups. In our tightest specification, the 
effects of race share are positive and significant at the 10% level or better for seven out of ten groups 
(whites, blacks, native Americans, Chinese, Japanese, other, and two races; only other Asian/Pacific 
Islander, Hispanic and three or more races are insignificant). In this specification, whites show the fourth 
smallest magnitude effect. Across specifications, the most uniformly positive and significant results are for 
whites, native Americans, and two races. In other words, the effect is not limited to whites, nor to a single 
racial group, nor is it easily attributable to simple narratives about black/white race relations. A likely 
explanation ought to apply to people of many different races. 

Third, our findings are unlikely to be driven by selection effects related to mobility. For example, 
one possible explanation for our results is if younger people live in diverse areas but then move to racially 
homogenous areas when they have children. We redo the analysis for women who have not been 
geographically mobile - those living in the same state they were born in, or who have not moved in the 
years prior to the ACS survey. Across all subsamples, we observe consistent effects, suggesting that 
selective migration patterns are not the primary explanation for our results. 

Fourth, we find that the result is present outside the United States, using international census data 
for countries that record racial classifications. Racial diversity is strongly associated with lower birth rates 
in Africa (South Africa, Mozambique, Zimbabwe), and also in a small sample of UK data. Central and 
South American countries show mixed evidence, with some having strong positive effects (Ecuador, El 
Salvador), others having significant negative effects (Uruguay, Cuba), and a number being insignificant 
(Jamaica, Brazil). These results do not reveal an obvious pattern of what drives the variation in effects 
across countries, but suggest that explanations unique to the U.S. are unlikely to be sufficient. 

Next, we explore specific predictions of homophily. While homophily is a general pattern, it is 
unlikely that all races have the same revealed preference for same-race marriage at each point in time. First, if 
interracial marriage is more common for a given race and year, racial isolation should matter less for 
fertility. Second, within a race and year, interracial marriage rates also differ by sex, as women of a given 
race may “marry out” of their race at higher rates than men, or vice versa. This predicts different effects 
across the sexes: if women of a given race marry out more frequently, then racial isolation effects will bite 
more for men of that race than for women (as the men are more dependent on same-race women than those 
women are dependent on them). We find both predictions borne out in the data. More intermarriage reduces 
the effect of racial isolation on fertility, and more intermarriage by women relative to men reduces the racial 
isolation effect for women of that race relative to men of that race. This is strongly consistent with 
homophily playing an important role in our effects and is not easily explainable by other channels. 

To further test if our results are due to the difficulty of finding a desired partner, we examine other 
relationship outcomes. A one standard deviation increase in race share is associated with a 1.2 percentage 
point higher probability of a woman being currently married, a 1.2 percentage point higher probability of 
having ever been married, and a lower age of first marriage (by 2.3 months). It is somewhat negatively 
related to the probability of divorce, though the effects are weaker. Diversity effects do not appear to be 
limited to the narrow costs of raising children, but also extend to the difficulty of finding a marital partner. 

Another prediction of homophily is that if people have preferences for similar partners along other 
demographic dimensions, we ought to find demographic share effects for other variables. The evidence here 
is more mixed: we find robust positive effects for income decile share that are around half to two thirds as 
large as the race share effect, consistent with the homophily among income levels documented in 
Greenwood et al. (2014). However, other variables like education and age do not show the same effects. 

An alternative mechanism for our results is social trust. As Putnam (2007) describes: “[I]n more 
diverse settings, Americans distrust not merely people who do not look like them, but even people who do. 
... Diversity seems to trigger not in-group/out-group division, but anomie or social isolation.” Reduced 
social trust could contribute both to the difficulty of finding a partner, and choices over the number of 
children. While the predictions and metrics of trust are not as sharp as for homophily, we find that state 
level social trust measures are positively related to birth rates, and including them in the regressions reduces 
the race share effect by around a quarter to a third. This holds both when using Putnam (2007) measures of 
generalized trust from surveys, or when using more recent Facebook data, such as local volunteering rates. 

Further evidence that homophily is unlikely to be the entire explanation comes from the fact that 
both race Herfindahl and race share measures show separate effects when included in the same regression. 
This holds even when the Herfindahl index is calculated only among races other than that of the individual 
in question. This helps us rule out the possibility that the Herfindahl index is merely capturing non-linear 
effects of race share. Under homophily, the main question is the availability of potential same-race matches 
in one's vicinity, which is captured by the race share measure. It is unclear why, after conditioning on this, 
variation in the concentration of other races should matter, whereas under social trust, this aspect is 
important. That is, under homophily, if whites are 60% of the population in an area, this determines their 
chances of meeting and marrying each other, and it makes no difference whether the remaining 40% is a 
single race, or many races. Empirically, this variation matters (although it is subsumed by area fixed effects, 
and thus hard to tightly distinguish from other area traits). The importance of racial concentration does not 
point to social trust specifically, but it is consistent with it, and is difficult to explain with homophily alone. 

Our final tests link time-series evidence of declining fertility rates within the U.S. to changes in 
diversity, related to the two motivating facts with which we began. The level of identification for time series 
changes is much weaker, but because the time series patterns are so stark, and the pattern so poorly 
explained in quantitative terms, it is an interesting question whether our cross-sectional evidence has 
enough bite to potentially be a driver of overall birth rates. We find that the average racial isolation explains 
44% of variation in the US total fertility rate since 1971, and 89% since 2006. The predicted decline between 
2006 and 2021 based on the coefficients is 0.426 children per woman, very close to the actual decline of 
0.444. Diversity is large enough as a factor to potentially explain a large amount of birth rate time series 
variation, especially the most puzzling changes in recent years. 

We argue that our evidence implies that diversity and birthrates have some fundamental tension 
between them. While homophily (and to a lesser extent, social trust) appear to be contributors to this 
relationship, it is less clear that they are the sole driving factors. Even if the pattern merely reveals other 
underlying factors that are not tied to race differences directly, these patterns are important to understand, 
as the changes in diversity and birth rates are among the most important demographic changes of our age. 


2. Literature Review 
Our paper is related to literature on the economics of fertility. This starts with the seminal work of 
Becker (1960), which explained the decline in fertility during industrialization in the 19th century partly by 
the declining value of children for agricultural work.¹ Becker and Lewis (1973) propose a quantity-quality 
tradeoff theory between having more children and investing more resources (e.g. education) into each one. 

Recent surveys by Doepke and Tertilt (2016), Greenwood et al. (2017), and Doepke et al. (2022) 
describe the determinants of fertility. Women’s decisions to have children are related to their labor market 
opportunities (Adsera 2005), and thus affected by drivers such as taxation (Guner et al. 2012, Bick and 
Fuchs-Schündeln 2017, and Borella et al. 2021), and access to education (Black, Devereux and Salvanes 
2008, McCrary and Royer 2011). Fertility rates are also related to government spending on early childhood 
education programs (Olivetti and Petrongolo 2017), which function as a form of childcare, and such access 
is especially important for the decision to have multiple children (D’Albis et al. 2017). 

Other research considers the role of family planning. Goldin and Katz (2002) argue that improved 
access to birth control for single women in the 1970s increased women’s incentive to invest in a career and 
delay marriage and childbearing. Kearney and Levine (2009) find similar fertility effects from Medicaid 
subsidies of contraception, and Myers (2017) finds fertility effects from abortion access. Finally, other 
papers have examined costs of family formation, including child car seat laws (Nickerson and Solomon 
2024), and mortgage deregulation (Hacamo 2021). Fertility is also affected by cultural influences like social 
attitudes to mothers working (Kleven et al. 2019) and TV shows (Kearney and Levine 2015). 

Relative to this literature, our paper makes several contributions. First, existing theories have 
considerable difficulty explaining the large and consistent decline in fertility in the US since 2007. As 
Kearney, Levine and Pardue (2022) describe it: “The Great Recession contributed to the decline in the early 
part of this period, but we are unable to identify any other economic, policy, or social factor that has 
changed since 2007 that is responsible for much of the decline beyond that.” We answer this challenge, 
and provide an explanation that is both new to the existing literature, and can potentially explain in 
quantitative terms the recent declines. Diversity differs conceptually from most of the existing birth rate 
drivers by being a property of a local area, whereas many of the studied aspects are narrow costs or benefits 
more closely related to child-raising and its substitutes that are targeted at the individual level. 

¹ This coincided with increasing economic growth, thus the prima facie puzzling inverse relationship between income 
and fertility. In recent decades, high-income countries no longer exhibit a negative correlation between income and 
fertility (Hazan and Zoabi 2015 and Bar et al. 2018). 

Second, our findings provide an alternative explanation for the empirical demographic pattern 
whereby immigrants from high-fertility countries tend to converge over time to lower native levels (Dubuc 
2012, Parrado and Morgan 2008, Sobotka 2008, Mulder and Wagner 2001, White, Moreno, and Guo 1995, 
Adsera and Ferrer 2015). The existing literature has mostly emphasized aspects of cultural transmission, 
but our results suggest an alternative mechanism — immigrants are typically moving from places where they 
have a high race share, to places where they have a low race share, and thus are expected to have lower 
fertility in subsequent generations. 

Our paper also contributes to the research on homophily and matching in partner traits. In addition to 
the previously cited literature on same-race preferences, there is also evidence of assortative matching based 
on income (Chiappori, Salanie and Weiss 2017, Greenwood et al. 2014, Fernandez et al. 2005, Schwartz 
and Mare 2005, and Chiappori et al. 2022). Our paper shows an important implication of such matching - 
when there is more local diversity along that dimension, marriage rates and birth rates are lower. While 
most of our evidence is about racial diversity, we find evidence for income-based diversity effects as well. 

Our paper is also related to the literature in political economy on the relationship between ethnic 
diversity and social trust. This documents negative connections between ethnic diversity and favorable 
outcomes, such as civic engagement (Costa and Kahn 2003), the provision of public goods (Alesina et al., 
1999), and self-reported trust levels (Putnam 2007). Dinesen et al. (2020), in their comprehensive meta- 
analysis, document a significant negative correlation between ethnic diversity and social trust across 1,001 
estimates derived from 87 studies. They surmise that the consistent negative correlation observed across various 
types of social trust aligns with Putnam's (2007) theory of anomie (social isolation), which posits a 
universal decline in trust across diverse social settings. Perhaps surprisingly, none of the studies discussed 
in the meta-analysis has examined fertility as an outcome. We contribute to these studies by documenting 
a new social outcome of diversity, and showing a link to both trust and homophily as potential drivers. 


2. Data and Variable Construction 
2.1 Data Sources 

Census data is obtained from IPUMS, a service of the Minnesota Population Center, which 
aggregates and standardizes census data from both US and international sources. U.S. census data is taken 
from decennial censuses from 1850 to 2010, plus yearly vintages of the American Community Survey 
(ACS) from 2000 through 2021. 1960 is excluded due to lacking local geographic information. International 
Census data is taken from IPUMS International, for nearly all samples where race data is non-missing (the 
United Kingdom, Mozambique, South Africa, Zimbabwe, Costa Rica, Cuba, El Salvador, Jamaica, Brazil, 
Colombia, Ecuador, Uruguay; we exclude very small samples from Suriname and Saint Lucia). U.S. Total 
Fertility Rate data and economic indicators (unemployment, GDP growth, inflation) are taken from the 
FRED website of the Federal Reserve Bank of St. Louis. 


2.2 Main Variables 

The main results of the paper relate local levels of racial diversity and racial isolation to birth rates. 
To do so, we have to unpack each of the component pieces: “birth rates”, “local”, “racial” and “diversity”. 

For “birth rates”, we use this term to refer to the number of children a woman has living in her 
house at the time of the survey. We are not specifically measuring a rate, but the regressions (in most 
specifications) control flexibly for age, among many other variables. While it would be possible to turn 
these numbers and ages of children into annual birth rates (as in Nickerson and Solomon (2024)), for most 
of the sample the race measures are only available at the same time and in the same survey as the birth 
counts, meaning there is not much ability to match diversity levels to birth choices at the time of conception. 

By "local", we conceptually refer to the community where the person resides, acknowledging that 
there is no single universally correct or optimal measure to define this. With arbitrarily fine data, one could 
imagine that the effect of houses nearby is different from the whole street, the neighborhood, the town, the 
county, and the state. Ideally, all these levels could be evaluated. With public census data, things are 
complicated on two dimensions. First, the collection of different geographic levels varies over time. As the 
Data Appendix describes, the 1850 census collects information on city and detailed metro area. County 
information first starts in 1950, and metro area information ends in 2011. Some samples measure both city 
and county, others measure only one or the other. Even when multiple levels are available simultaneously, 
they do not nest each other. That is, there are cases of multiple counties within a city, as well as multiple 
cities within a county. Because city is generally finer (i.e. taking respondents where both city and county 
data are non-missing, the average number of cities per county is 5.35, whereas the average number of 
counties per city is 1.10), we take as our baseline measure: 

-First city, if this is available 

-If no city information is available, then county (if this is non-missing) 

-If neither city nor county is available, then detailed metro area. 

-If none of these are available, the observation is dropped in the main analysis. 

For our state-level diversity variables, because these do not require finer geographical information, we 
include all residents of a state (even if they lack other geographic information). 
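
The fallback rule above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `baseline_area` and its arguments are hypothetical, and missing values are assumed to be represented as `None` or an empty string.

```python
from typing import Optional, Tuple


def baseline_area(
    city: Optional[str],
    county: Optional[str],
    metro: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Return (level, identifier) for the first available geography.

    The order follows the text: city, then county, then detailed metro area.
    Returns None when all three are missing, in which case the observation
    would be dropped from the main analysis.
    """
    for level, value in (("city", city), ("county", county), ("metro", metro)):
        # Assumption: None or "" marks a missing geographic identifier.
        if value is not None and value != "":
            return (level, value)
    return None
```

Returning the level together with the identifier keeps a city and a county that happen to share a name from being pooled into a single area.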

Second, “racial”. There are numerous different ways to classify race. When using U.S. data, we 
follow the Census race classifications. For broad race measures, they include nine categories — white, 
black/African American, American Indian or Alaska native, Chinese, Japanese, other Asian or Pacific 
Islander, other race, two major races, and three or more major races. In addition, they also ask about 
Hispanic or Latino ethnicity, which interacts with the above. So, one can be Hispanic white, Hispanic black, 
Hispanic other, or any other combination.² 

The aim here is to map to how people construct their own identity. As our primary grouping, we 
put all Hispanic/Latino respondents in a single, separate category. In this respect, our shorthand use of 
"race", unless otherwise qualified, refers to these ten groupings (the nine Census broad race groups, plus a 
tenth for Hispanic/Latino). The understanding is that this combines aspects of both race and ethnicity, in 
terms of how people construct their sense of identity. Including Hispanics/Latinos as a single group 
implicitly assumes that their sense of what it means to be surrounded by people “like them” covers other 
Hispanic/Latino people (rather than, say, Hispanic whites feeling that Hispanic other are a different group). 
Some categories are unsatisfactory for this purpose no matter how it is done: it seems unlikely that “three 
or more major race” respondents only feel a sense of similarity with other “three or more major race” 
people, who may have entirely different combinations of race. All these measures are imperfect, and we 
later explore a number of other definitions, but using them does not materially affect our results. 

² In 2021, the most common race labels self-chosen by Hispanic/Latino respondents are “other” or “two major races”, 
due to the U.S. classifications lacking a racial category that corresponds to Amerindians from Central and South America 
(whereas census race definitions in other countries include “Indigenous” in Brazil, Colombia, Costa Rica, Ecuador, El 
Salvador and Uruguay, and mixed-race versions like “Mestizo” in Ecuador, El Salvador, and Uruguay). 

Finally, the last metric is diversity and racial isolation. Both are linked by the idea of being 
surrounded by people who differ from you. Our notion of diversity relates to there being many different 
groups who are each a small fraction of the population — that is, the overall level of racial concentration. 
Importantly, we do not mean “diversity” merely as a shorthand for “not white”. This alternative measure 
aligns more closely with the race control variables themselves. We later also employ race as an interaction 
term in our analysis. Our main version of diversity uses a Herfindahl index of racial concentration: the 
sum of the squares of the fraction of the population made up by each race in that area and year. 

For racial isolation, we focus on the concept of being a small fraction of the population. We measure 
this using the race share variable, which represents the proportion of the local population with the same 
race as the woman in question. This is mechanically related to race Herfindahl, as shares are always zero 
or positive, and so race share squared (the addition to a Herfindahl) goes up with race share. The principal 
difference is that a race Herfindahl applies to everyone in an area, and so does not have any variation within 
an area and year. In this sense, while the Herfindahl measure maps most closely to the ordinary definition 
of “diversity”, it is necessarily hard to disentangle from other attributes in that community and year that 
greater variation in races may be associated with. 
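
To make the two measures concrete, the following sketch computes both for one hypothetical area and year. The function names and the toy population are illustrative assumptions, not taken from the paper's code or data; in the paper's base specifications the population would be restricted to adults aged eighteen and over.

```python
from collections import Counter
from typing import Dict, Iterable


def race_shares(races: Iterable[str]) -> Dict[str, float]:
    """Fraction of the local population belonging to each race."""
    counts = Counter(races)
    total = sum(counts.values())
    return {race: n / total for race, n in counts.items()}


def race_herfindahl(shares: Dict[str, float]) -> float:
    """Sum of squared race shares; one value per area and year."""
    return sum(share * share for share in shares.values())


# Hypothetical area-year with ten adults.
adults = ["white"] * 6 + ["black"] * 3 + ["chinese"] * 1
shares = race_shares(adults)   # race share differs by a person's own race
hhi = race_herfindahl(shares)  # identical for every resident of the area

# Shares are 0.6, 0.3 and 0.1, so the Herfindahl is 0.36 + 0.09 + 0.01 = 0.46.
# Five equally sized groups would instead give 5 * 0.2**2 = 0.2 (more diverse).
```

In this example every resident shares the same Herfindahl value, while a white resident has a race share of 0.6 and a Chinese resident a race share of 0.1, which is the within-area variation that the race share measure exploits.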

Given these considerations, we opt to use the race share variable as our primary measure of racial 
diversity. There is a maintained assumption, which is hard to test, that race share and race Herfindahl are 
capturing similar underlying concepts; that is, that being a small racial group within your area draws on 
the same underlying mechanism as living in an area with many other different races. The consistency in 
the direction of the results obtained using both variables strengthens the overall findings, even if the specific 
underlying mechanisms captured by each measure may differ. In practice, the results of the paper work 
similarly under either measure. The main difference is that race share allows for the addition of an area by 
time fixed effect. That is, we can control for all possible drivers of the number of children in a given area 
and year, and focus only on the difference between being part of a large racial group versus a smaller one. 

Finally, in our base specifications, we measure race share and Herfindahl for the population aged 
eighteen and over, so that the number of children is not mechanically linked to attributes of those children. 


3. Results 
3.1 Base Effects of Diversity on Birth Rates 
We begin by relating diversity and racial isolation to birth rates. Our main specification is: 
$$\text{NumberOfChildren}_{ijt} = a + b_1 \cdot \text{RaceShare}_{ijt} + b_2 \cdot \text{Controls}_{ijt} + e_{ijt}$$

Observations are taken for a woman i, living in area j, in year t. The list of controls varies according 
to specification, and we introduce them as they are added. 
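
As a purely illustrative sketch of this specification, the snippet below fits the regression by ordinary least squares on a hypothetical, noise-free toy sample with a single control (a race dummy). The data and the coefficients used to generate it (1.0, 0.7 and 0.5) are invented for the example. The paper's estimates rely on census microdata, many more fixed effects, and standard errors clustered by state and year, none of which is reproduced here.

```python
import numpy as np

# Hypothetical toy sample: six women, two races, no noise term.
race_share = np.array([0.9, 0.7, 0.5, 0.3, 0.2, 0.1])
is_race_b = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # dummy for the second race

# Outcome generated from known (invented) coefficients so the fit can be checked.
children = 1.0 + 0.7 * race_share + 0.5 * is_race_b

# Design matrix: intercept, RaceShare, and one control.
X = np.column_stack([np.ones_like(race_share), race_share, is_race_b])

# Ordinary least squares; coef holds the estimates of (a, b1, b2).
coef, *_ = np.linalg.lstsq(X, children, rcond=None)
a_hat, b1_hat, b2_hat = coef
```

Because the toy outcome has no noise, `b1_hat` recovers the coefficient of 0.7 placed on RaceShare up to floating-point error; with real data the estimate would of course carry sampling uncertainty.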

Table 2 Panel A presents the baseline results of race share on number of children. Column 1 is a 
univariate regression with no controls. In this specification, RaceShare is positively associated with the 
number of children, with a coefficient of 0.159, and significant at the 1% level with a t-statistic of 2.84 
(with standard errors clustered by state and year). This regression includes years dating back to 1850. In 
terms of the economic magnitude, a one standard deviation increase in RaceShare is associated with the 
woman having 0.052 more children, on average. 

Column 2 adds controls for Race (that is, the nine Census racial groups plus a tenth for 
Hispanic/Latino). RaceShare is necessarily correlated with race itself, as being a more numerous racial 
group (such as whites) makes it more likely that you will live near more people of the same race. When 
this is controlled for, the effect becomes larger and much more significant: the coefficient is now 0.708, 
with a t-statistic of 6.89. Intuitively, once we control for the fact that whites have a high race share in 
general, and low birth rates in general, the effect of RaceShare increases greatly. That is, comparing two 
women of the same race shows a large effect of RaceShare on their number of children. We report two 
effect sizes. The first is the effect of one unconditional standard deviation of RaceShare (from column 1) 
multiplied by the coefficient. This is 0.230 more children, in this case. The second calculates the effect of 
a conditional standard deviation in RaceShare. That is, we regress RaceShare on the same fixed effects in 
the regression (here, just Race), and compute the standard deviation of the residuals. One standard deviation 
of this is associated with 0.123 more children. The difference between these two measures is approximately 
whether you use all the variation in RaceShare (and assume it has the same effect as the aspects already 
controlled for), or whether you just use the part remaining after stripping out the controlled-for components. 
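
The distinction between the two effect sizes can be illustrated with a short sketch. The data below are made up, the function names are hypothetical, and the population standard deviation is used for simplicity (an assumption; the text does not say which estimator is used). With a single set of fixed effects, regressing RaceShare on the group dummies is equivalent to subtracting each group's mean, so the residual standard deviation is the standard deviation of the demeaned shares.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical (race, race_share) pairs; values are illustrative only.
obs = [
    ("white", 0.90), ("white", 0.70), ("white", 0.80),
    ("black", 0.30), ("black", 0.10), ("black", 0.20),
]

def unconditional_sd(rows):
    """Standard deviation of the raw shares."""
    return pstdev([share for _, share in rows])

def conditional_sd(rows):
    """SD of residuals from regressing the share on group dummies.

    With one set of fixed effects, the residual is the share minus its group mean.
    """
    by_group = defaultdict(list)
    for group, share in rows:
        by_group[group].append(share)
    group_mean = {g: mean(v) for g, v in by_group.items()}
    residuals = [share - group_mean[g] for g, share in rows]
    return pstdev(residuals)

coef = 0.708  # RaceShare coefficient quoted in the text for column 2
effect_unconditional = coef * unconditional_sd(obs)
effect_conditional = coef * conditional_sd(obs)
```

In the toy data most of the variation in shares is between groups, so the conditional standard deviation, and hence the conditional effect, is much smaller than the unconditional one.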

Column 3 adds fixed effects for State and Year. The coefficient is reduced to 0.435, but the t-statistic 
is similar at 6.92. The unconditional and conditional effects of a one standard deviation increase in 
RaceShare are 0.141 and 0.067 more children, respectively. Column 4 adds fixed effect controls for various 
demographic variables, collectively referred to as Demographics. This includes Race as before, but also 
categories for the woman’s age, marital status, nationwide deciles of income, employment status, education, 
and citizenship status. With the full set of demographic controls, the earliest year for observations is now 
1980. As before, the coefficient is reduced, to 0.310, but the t-statistic is increased to 7.30. Unconditional 
and conditional standard deviation effects are now 0.101 and 0.049 more children, respectively. 

Column 5 keeps the Demographics variables, but replaces the State and Year fixed effects with an 
interacted State-Year fixed effect. The coefficient is now 0.241, with a t-statistic of 8.58. Column 6 replaces 
the baseline Demographics variables with interactions of Demographics*State and Demographics*Year (in 
addition to the State-Year effects). The coefficient increases to 0.291 with a t-statistic of 7.03. 

Column 7 adds two new sets of controls related to local area metrics. First, we add dummy variables 
for the area type (county, city or metro area). Second, we calculate other metrics averaged at the local area: 
the fraction employed, the fraction college educated, average age, average income decile, and a z-score for 
the fraction of people who have moved in the last one or five years, depending on data availability.’ We 
collectively refer to these as Area Traits. Unlike other controls, these are calculated as linear effects. Adding 
these variables reduces the coefficient to 0.204, with a t-statistic of 6.58. The effect of an unconditional and 
conditional standard deviation of RaceShare is now 0.066 and 0.026 children, respectively. 

Column 8 replaces the variables for area type with dummies that split each area type into population 
buckets (that is, creating Area Type * Population Group), where the grouping is either halves, quintiles or 
deciles, depending on the number of observations.* When comparing population characteristics across 
different geographic units, it is important to ensure that the units are comparable. The population of a city 
may not reflect the same meaningful “size” as the population of the surrounding metropolitan area. Instead, 
we operate under the assumption that populations of cities can be compared to other cities within the same 
year, counties with counties, etc., as they are comparable units. The coefficient is now 0.160, with a t- 
statistic of 5.00. Unconditional and conditional standard deviation changes in RaceShare result in 0.052 
and 0.019 more children. Column 9 allows all area controls to be time-varying, so we replace Area Traits 
with Area Traits * Year and Area Type * Population Group * Year. Column 10 adds an Area fixed effect 
(e.g. for Cook County, Illinois). In both cases, the coefficients and significance are very similar to before. 

Finally, we absorb all variation at the level of the area and year in Column 11, adding in Area*Year 
fixed effects. These replace all of the other area level controls (Area Traits * Year, and Area Type * 
Population * Year), as well as the State*Year fixed effects. In this specification, the only controls are 
Demographics * (State, Year) and Area*Year, with everything else being absorbed. The coefficient is 
largely unchanged from before, being 0.197 with a t-statistic of 5.88. An unconditional and conditional 
standard deviation increase in RaceShare results in 0.064 and 0.020 more children, respectively. 


? Different census years list either whether the respondent has moved in the last year, or the last five years. We first 
compute the average of each of these at the local area level. To make these comparable across years, we convert each 
into a z-score across all local areas that year. If both are available, we average the two. 

^ Each category (county, city, metro area) is split into percentiles based on the number of respondents that year from 
the area in question. If there are 20 or fewer area type observations in that year (e.g. fewer than 20 cities in the data 
that year), areas are split into high and low populations. If there are between 21 and 100 area type observations, they 
are split into quintiles. If there are more than 100, they are split into deciles. 
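
The bucketing rule in this note can be written as a small function. This is only a sketch: the function names are hypothetical, and the way areas are ranked and ties are broken is an assumption, as the note does not specify it.

```python
def n_population_groups(n_areas):
    """Number of population buckets for one area type in one year."""
    if n_areas <= 20:
        return 2    # high vs. low population
    if n_areas <= 100:
        return 5    # quintiles
    return 10       # deciles

def population_group(rank, n_areas):
    """Bucket index (0 = smallest) for an area ranked 0..n_areas-1 by respondent count.

    Assumes areas have already been sorted by number of respondents.
    """
    return rank * n_population_groups(n_areas) // n_areas
```

For example, with 50 cities in a year the cities are split into quintiles, and the largest city (rank 49) falls in bucket 4.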
With Area*Year fixed effects, we control for many general properties of an area and year that might 
influence birth rates. All the variation comes from differences in RaceShare between different groups in an 
area (i.e., comparing racial groups that are more numerous in that area versus less numerous). Because we 
also have Race*Year and Race*State (as part of the Demographics * (State, Year)), we are also comparing 
each group with the overall birthrate of that racial group in that state, and in that year. For example, if we 
consider Detroit, MI in 2007 (which is predominantly black), blacks in the city will have more children 
relative to blacks in Michigan generally, or blacks in 2007 generally, and whites in Detroit will have fewer 
children. Meanwhile, in Ann Arbor, which is predominantly white, the pattern will be reversed — whites 
will have more children than elsewhere in the state and year, and blacks will have fewer children. Other 
individual-level differences in the populations are controlled for in Demographics * (State, Year). 

Because of this, our results cannot be attributed to any area-level traits that might be associated 
with diversity in general. That is, if more diverse areas are richer or poorer, have more jobs or fewer, are 
denser or sparser, or anything else, all of this is controlled for by the Area*Year effects. Racial isolation 
is now separate from the general level of diversity of an area, which is also absorbed. It is notable that the 
final step of Area*Year fixed effects changes the results very little. Controlling parametrically for the other 
aspects of the area produces very similar results to flexibly controlling for it with fixed effects. 
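
The logic of absorbing area-level traits can be seen in a small sketch of the within transformation, which is one standard way to implement a fixed effect (we do not claim it is how the paper's regressions were computed). The data, column layout, and function name below are hypothetical. Subtracting the area-year mean from every variable removes anything that is constant within an area and year, leaving only differences between groups in the same cell.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (area, year, race_share, area_trait, n_children).
rows = [
    ("A", 2000, 0.8, 1.5, 2.0), ("A", 2000, 0.2, 1.5, 1.0),
    ("B", 2000, 0.6, 0.3, 3.0), ("B", 2000, 0.4, 0.3, 2.5),
]

def demean_within(rows, col):
    """Subtract the (area, year) cell mean from column `col` of each row."""
    cells = defaultdict(list)
    for r in rows:
        cells[(r[0], r[1])].append(r[col])
    cell_mean = {k: mean(v) for k, v in cells.items()}
    return [r[col] - cell_mean[(r[0], r[1])] for r in rows]

x = demean_within(rows, 2)      # race share: varies within a cell
trait = demean_within(rows, 3)  # area trait: constant within a cell
y = demean_within(rows, 4)      # number of children

# One-regressor OLS slope on the demeaned data (the "within" estimator).
slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
```

After demeaning, the area trait is identically zero and so cannot explain anything, while the race share slope is identified purely from within-cell differences.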

Because the RaceShare variable includes both the “lots of different races in an area, each being 
small" aspect, and the “you personally are from a smaller group" aspect, we next turn to a version that 
captures only the first aspect. In Panel B, we replace the person's race share with a Herfindahl index of the 
different races in that area and year. This approach aligns more closely with the common understanding of 
diversity, but it comes with the limitation that the resulting measure only varies at the area-by-year level. 

Column 1 shows that without controls, RaceHerfindahl positively predicts birth rates, with a 
coefficient of 0.590 and a t-statistic of 5.33. The unconditional effect of a one standard deviation increase 
in RaceHerfindahl (i.e., an area becoming more racially concentrated) is associated with 0.125 more 
children. The effect increases in magnitude and significance when adding race controls in column 2. 
Column 3 adds State*Year fixed effects, and the coefficient is 0.664, similar to the univariate specification, 
now with a t-statistic of 15.02. Adding Demographics*(State, Year) in column 4 reduces the effect to a 
coefficient of 0.342, and a t-statistic of 14.95. Unconditional and conditional standard deviation changes in 
RaceHerfindahl are associated with 0.073 and 0.044 more children, respectively. Adding Area Type and 
Area Traits reduces the effect somewhat to 0.207 in column 5. Adding Area Traits * Year and Area Type * 
Population * Year in column 6 gives an effect of 0.121, with a t-statistic of 3.91, and effect sizes of 0.026 
and 0.011 for unconditional and conditional standard deviation increases in RaceHerfindahl. 

Next, we add Area fixed effects. Relative to Panel A, it is less clear what the right level of controls 
is. In the limit, adding in Area*Year will absorb all the variation, so this is not possible. Nonetheless, even 
when absorbing the average level of RaceHerfindahl via an Area fixed effect, we still find a positive (albeit 
smaller) and significant effect of 0.037, with a t-statistic of 4.16. The unconditional and conditional effects 
of a one standard deviation change in RaceHerfindahl are now 0.008 and 0.001 more children, respectively. 

In Panel C, we include both RaceShare and RaceHerfindahl in the same regression. The 
specifications are the same as those in Panel B. In general, both variables show positive effects that are not 
subsumed by the other. The only exceptions are in column 1 (with no controls), where RaceShare loads 
negatively when race controls are absent, and in columns 6 and 7, where the addition of Area fixed effects 
and the RaceShare variable means that RaceHerfindahl is either zero or negative. In general, coefficients 
are somewhat reduced relative to the specifications with only one or the other variable, which makes 
intuitive sense given that the two variables have decent overlap, both conceptually and empirically. 

One potential concern with Panel C is that RaceHerfindahl could be picking up non-linear effects 
of RaceShare, rather than a separate effect of concentration. Because RaceHerfindahl is made up of the sum 
of squared race shares, if RaceShare has additional effects beyond the linear specification we use, this may 
lead to RaceHerfindahl having measured effects even if racial concentrations do not matter directly. To rule 
out this possibility, in Panel D we replace the RaceHerfindahl with a different version, 
OtherRaceHerfindahl, which is the Herfindahl index just computed across all other races than the 
respondent's. In other words, this alternative version is orthogonal to whatever the respondent's own 
RaceShare is, and just represents the concentration of the remaining population. The results are generally 
similar to those in Panel C, but somewhat stronger: the negative effect of RaceShare in column 1 
disappears (and it is now positive and significant), while the zero and negative coefficients in columns 6 
and 7 become positive and zero, respectively. 
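
One way to construct such a measure is sketched below. The function name is hypothetical, and rescaling the remaining shares so that they sum to one is our assumption about how "the concentration of the remaining population" would be computed; the text does not spell out the normalization.

```python
def other_race_herfindahl(shares, own_race):
    """Herfindahl over every race except own_race, with shares rescaled to sum to one.

    Returns None when no other race is present in the area.
    """
    others = {race: s for race, s in shares.items() if race != own_race}
    total = sum(others.values())
    if total == 0:
        return None
    return sum((s / total) ** 2 for s in others.values())

# Hypothetical area with three groups.
area_shares = {"white": 0.5, "black": 0.3, "asian": 0.2}
```

For a white respondent in this toy area, the remaining population is 60% black and 40% Asian, giving a value of 0.36 + 0.16 = 0.52 regardless of how large the white share is.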

While RaceHerfindahl appears to be an important separate driver of birth rates, it is harder to 
distinguish from other area-level effects. This is seen in column 7, where including an area fixed effect 
causes both RaceHerfindahl and OtherRaceHerfindahl to lose their significance in Panels C and D. In other 
words, when both the area average level of RaceHerfindahl (or OtherRaceHerfindahl) is controlled for by 
an area fixed effect, and the level of RaceShare is controlled for, the remaining variation in racial 
concentration does not drive fertility. Recall that both sets of controls are necessary, however — in Panel B, 
RaceHerfindahl on its own still has significant effects with an area fixed effect included (but with 
RaceShare absent). For this reason, we argue that the bulk of the evidence supports the conclusion that 
racial concentration matters, over and above the level of the respondent's own racial share in the population. 
Nonetheless, a reader who is skeptical of what racial concentration is measuring absent the inclusion of an 
area fixed effect may not be convinced of a separate role for racial concentration. For this reason, in the 
remainder of the paper, we mostly focus on the RaceShare variable, due to the ability to add Area*Year 
fixed effects and get a tighter interpretation of what the variable measures. The results of the paper are 
generally similar if RaceHerfindahl is used instead, absent the inclusion of area fixed effects. 


3.2 Alternative Specifications 


Next, we explore a number of variations on the main specifications above. Table 3 Panel A 
constructs versions of the RaceShare variable using alternative definitions of race. These are i) omitting the 
Hispanic/Latino category, ii) using detailed race (instead of broad race) and omitting Hispanic/Latino, iii) 
using broad race and including Hispanic/Latino as an interaction rather than a separate category, iv) using 
detailed race and including Hispanic/Latino as an interaction, v) using ancestry share, instead of any race 
classification, and vi) using the base definition over the whole population, including those under eighteen. 
For each variable, we include Area*Year and Demographics*(State, Year) controls (corresponding to Table 
2 Panel A column 11). The effects are positive and significant at the 1% level in all cases. 

Panel B constructs the RaceShare variable at different geographical levels. As geographic 
information varies across census years, limiting the analysis to only one type of geography alters the range 
of years included. To ensure that the controls remain comparable, in columns 1-4 we include State*Year 
and Demographics*(State, Year) (as other controls require area level information, which is not available 
for all specifications). Recall that in the base case, levels are constructed sequentially based on availability, 
so that (for instance) county is only used if city information is missing. For our first three measures, we use 
i) city, ii) county, and iii) detailed metro area, for all observations in each respective category. Next, iv) we 
reverse the priority order of city and county, so using county first, then city, and finally detailed metro area. 
In columns 1-4, all the relationships are positive and significant. We also v) use state level measures in 
columns 5 and 6, thus including all observations from the state, even those with missing information on any 
finer geography. For these two specifications, we omit State*Year controls, as these would map to within- 
area versions of the variable (and thus not be comparable to the earlier columns). State level measures are 
only significant at the 10% level, however, in column 5. This suggests that it is more local geography that 
drives these effects. Consistent with this notion, in column 6 we add both the baseline RaceShare variable 
and the state level version in the same regression. The baseline local version is highly significant while the 
state level metric exhibits a somewhat negative effect. 

Panel C explores different levels of weighting. The baseline regressions weight every response 
equally, which necessarily draws more observations both from larger population areas, and from recent 
years. In column 1, we weight every Area*Year combination equally, regardless of the number of 
respondents. In column 2, we weight every year equally. In column 3, we weight each year equally, but also 
weight each observation in that year according to the census household weights. In columns 4-6, we apply 
census household weights within the area when constructing the RaceShare variable, and apply the same 
observation-level weighting choices as before. The results are positive and highly significant in all 
specifications, which include Demographics*(State, Year) and Area*Year. 

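The equal-weighting schemes can be illustrated as follows. This is a sketch under stated assumptions: the data and the function name equal_cell_weights are hypothetical, and we assume that "weighting every cell equally" means each observation receives a weight of one over the number of observations in its cell, so that every cell has the same total weight.

```python
from collections import Counter

# Hypothetical respondents, each tagged with its (area, year) cell.
cells = [("A", 2000), ("A", 2000), ("A", 2000), ("B", 2000), ("B", 2010), ("B", 2010)]

def equal_cell_weights(keys):
    """Weights such that every distinct key has a total weight of one."""
    counts = Counter(keys)
    return [1.0 / counts[k] for k in keys]

# Each Area*Year combination weighted equally.
area_year_weights = equal_cell_weights(cells)

# Each year weighted equally.
year_weights = equal_cell_weights([year for _, year in cells])
```

With these weights, a large area or a recent, heavily sampled year contributes no more in total to a weighted regression than a small area or an early year.
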
3.3 Mobility and Selection 


We now turn to tests designed to shed light on what the baseline result of the paper is measuring. 
One class of explanation is selection effects based on mobility. When people have children, or are thinking 
about having children, they may desire to be in areas with more people of their own race, even if that does 
not directly affect how many children they have. This could come from a direct preference for being around 
people of the same race (a social form of homophily), or being drawn to particular amenities in an area that 
are more favored by one race over another. If these are complements to having children, then people might 
relocate because of the child choice, rather than the child choice being affected by the diversity. 

To test this, in Table 4 we re-run our tests using various metrics of women who are less likely to 
have moved. If a woman has not moved at all, then it is not a concern that she moved based on diversity 
and fertility decisions. Information on mobility-related questions is collected unevenly over years, so we 
measure mobility in different ways, limiting the sample to women who are less likely to be mobile. All 
regressions include controls for Demographics*(State, Year), and Area*Year. 

When limiting the sample to women who are less mobile, we still find positive and statistically 
significant effects at the 1% level in all specifications. Column 1 limits the sample to women living in their 
state of birth. Recall that the coefficient on RaceShare in the analogous specification (Table 2 Panel A 
Column 11) is 0.197. For women living in their state of birth, the coefficient is slightly lower, at 0.161. 
Column 2 limits the sample to women who haven't moved in the past year. The effect is similar to Table 2, 
at 0.200. Column 3 limits the sample to women who haven't moved in the past five years, and finds a 
somewhat lower coefficient of 0.124. Column 4 takes women who either haven't moved in the past year, 
or haven't moved in the past five years (with surveys generally asking either one question or the other, but 
not both). The effect is 0.191. Finally, if any of the three measures of being less mobile is grounds for 
inclusion, the effect is 0.190, with a t-statistic of 5.42. The robustness of the effects across all subsamples 
suggests that mobility and selection are unlikely to be the primary drivers of the relationship between race 
share and birth rates. While these factors may contribute to the overall effect, as evidenced by the slightly 
lower coefficients in some specifications, they do not appear to be the dominant explanation for the findings. 


3.4 Time Periods 

Next, we consider the effect across different time periods. While this is not a direct test of a specific 
mechanism, the very long time period of our data allows us to implicitly test the importance of a variety of 
different theories. For instance, one might imagine that the effect is concentrated around the Obama presidency, 
the Civil Rights Act, or policies during Reconstruction. In Table 5, we evaluate the baseline effect in 
different periods of the U.S. census dating back to 1850. The set of controls here is limited by what is available 
in the early periods: to ensure comparability, in all years we use the controls available in 1850. In Panel 
A, this is State*Year, and (Age, Race) * (State, Year). In Panel B, we also include an Area*Year fixed effect. 

The periods studied include 1850-1860 (column 1), 1870-1890 (column 2), 1900-1940 (column 3), 
1950-1970 (column 4), 1980-1990 (column 5), and 2000-2021 (column 6). In Panel A, we find large and 
significant results in all specifications. In terms of magnitudes, the coefficients in Panel A are generally 
decreasing over time, ranging from 1.889 in 1850-1860, to 0.515 in 2000-2021. However, due to the 
smaller number of observations in the early period, the significance is lower, with 1850-1860 having a 
t-statistic of 2.01, significant at the 10% level (with all other periods significant at the 1% level). If 
magnitudes are measured in terms of marginal effects of a one standard deviation change in race share, the 
largest effect is in 1850-1860 with 0.33 children, decreasing to a marginal effect of 0.158 in 2000-2021. The 
higher variation in RaceShare in later years offsets some of the decrease in coefficients, so the difference 
in marginal effects is not as large. 
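
The offsetting role of the variation in RaceShare can be checked with simple arithmetic on the figures quoted above. Since the marginal effect is the coefficient times the standard deviation, the implied standard deviation is the effect divided by the coefficient. The quoted figures are rounded, so the results are approximate, and the variable names are ours.

```python
# Coefficients and one-standard-deviation effects as quoted in the text for Table 5 Panel A.
periods = {
    "1850-1860": {"coef": 1.889, "effect": 0.33},
    "2000-2021": {"coef": 0.515, "effect": 0.158},
}

# effect = coef * sd, so the implied standard deviation is effect / coef.
implied_sd = {p: v["effect"] / v["coef"] for p, v in periods.items()}

# How much larger the early-period figure is than the late-period one.
coef_ratio = periods["1850-1860"]["coef"] / periods["2000-2021"]["coef"]
effect_ratio = periods["1850-1860"]["effect"] / periods["2000-2021"]["effect"]
```

The implied standard deviation of RaceShare rises from roughly 0.17 to roughly 0.31, so while the coefficient falls by a factor of about 3.7, the marginal effect falls by a factor of only about 2.1.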

Panel B includes Area*Year fixed effects. Now the first several columns are no longer statistically 
significant, with significance being stronger starting in 1950. Interestingly, the coefficients now somewhat 
increase over time, although they are stable from 1950 onwards. In this respect, it is not clear what to infer 
about the magnitude of the effect over time, as the answer depends on what level of controls is applied. 

Despite not being directly tied to a particular theory, Table 5 in fact greatly constrains what 
explanations can be operating. If one assumes that the same pattern in the data is driven by the same cause, 
then when variation between areas is included, that cause must be operating before the Civil War, during 
Reconstruction, during the Gilded Age, during both World Wars, during the Civil Rights Era, at the end of 
the Cold War, and throughout the 21st century. Even if one only believes specifications using just 
within-area variation, the cause must be present since 1950. Theories that emphasize contemporary aspects of race 
relations, regardless of the specific aspect they focus on, will generally face challenges in explaining the 
pervasive presence of this effect throughout U.S. history. The consistency of the observed relationship 
between racial diversity and birth rates across various historical periods suggests that the underlying 
mechanisms are likely to be more fundamental and deeply rooted than those captured by theories primarily 
concerned with current racial dynamics. 


3.5 Effects Across Races 

To further investigate potential mechanisms, we examine how the baseline effect varies across 
different racial groups. While the main results control for race (and its interactions with state and year) as 
a determinant of birth rates, here we focus on the interaction effects. Many theories about the impact of 
diversity primarily emphasize black/white race relations, and concepts like the historical legacy of slavery. 
An important test for such theories, if they are indeed the primary drivers of the observed effect, is what 
prediction they make for other racial groups. By examining the interaction effects across a wide range of 
racial groups, we can better assess the applicability and explanatory power of theories that predominantly 
focus on specific racial dynamics. 

We consider these possibilities in Table 6. We run a similar set of specifications to Table 2 Panel B 
(though as we include race interactions, all specifications require Race fixed effects). When thinking about 
the effect across different races, there are two ways to consider the effect: 

- Does it hold in the most stringent specification (i.e. with Area*Year fixed effects)? 
- Does it hold across a wide range of different specifications? 




To begin with the first aspect, namely the effect across races in the most stringent specification, 
Table 6 Column 7 shows the results for interactions of RaceShare with all ten racial groups, after adding 
controls for Demographics*(State, Year) and Area*Year. Given that these ten racial groups encompass all 
the possible categories for the baseline regressions, the ten interactions subsume the base effect, and thus 
the interpretation is whether the RaceShare variable has a significant effect on birth rates specifically within 
that racial group. The results show that the effect is positive and significant at the 10% level for 7 out of 10 
groups (with only Hispanic, other Asian / Pacific Islander, and three or more races not being significant). 
In terms of magnitude, whites show the fourth smallest effect, and two races and other have the largest. 

However, when considering the consistency of the effect across different specifications for each 
racial group, a somewhat different picture emerges. For white respondents, the effects are positive and 
significant in every specification, which is relatively unsurprising, given the baseline result for all 
respondents is very strong, and whites constitute the largest racial group in the sample. In contrast, the 
consistency of the effect varies more notably across specifications for other racial groups, suggesting 
potential differences in the robustness of the relationship between racial diversity and fertility depending 
on the level of controls and specific racial population being examined. Results are generally also positive 
and significant for Native Americans/Indians, blacks, and two races. Effects are generally positive but not 
always significant for Hispanic, other, and three or more races, with the only significant values being 
positive. Other Asian/Pacific islander is insignificant in all specifications. Japanese and Chinese switch 
from positive and significant to negative and significant across specifications. 

The interpretation of the most stringent specification is the clearest, namely that the effect is present 
in some degree for nearly all racial groups once the largest number of other alternative drivers of birth rates 
are accounted for. However, the interpretation of the other specifications is less clear, as it requires either 
taking a definitive stance on the precise (and smaller) number of controls that should be included or simply 
considering the generality of the results. Even in this case, it is not clear what explanation would show the 
most reliable results for whites, blacks, Native Americans, and two races. 

3.6 International Results 

Another class of explanations involves aspects that are unique to U.S. history or the U.S. context of race 
relations. An important test of such theories is whether the results are present in other countries. To test this, 
we use IPUMS international data, for nearly all countries that collect race information, excluding only Saint 
Lucia and Suriname where the small sample sizes make geographic measurements challenging. In Panel A, 
we consider African countries (Mozambique, South Africa, Zimbabwe) plus the UK (the only European 
country we observe). In Panel B, we consider Central American countries (Costa Rica, Cuba, El Salvador, 
Jamaica). In Panel C, we consider South American countries (Brazil, Colombia, Ecuador, Uruguay). 

IPUMS codes up geography at the coarse level and the fine level, which roughly corresponds to 
states and sub-state units (cities, counties, etc.). We measure RaceShare at the finest level available, usually 
fine geography, but sometimes coarse geography. 

For race definitions, we use the groupings collected by each country, with full details provided in 
the data appendix. When multiple samples are available for a given country, we add Year interactions where 
applicable. For each country, where possible we use three specifications: 

i) Coarse Geography*Year and Demographics. 

Here, coarse geography approximately corresponds to state. Our list of Demographics includes whatever is 
available out of: an urban dummy, race, marital status, age, educational attainment, and employment status. 

ii) Coarse Geography*Year, Demographics*(Coarse Geography, Year), Ln Population Density*Year. 

We also include the natural log of population density, interacted with year. 

Finally, in iii) we also include Fine Geography*Year. 

Exact details of what is included for each country are in the data appendix. 

Panel A examines the effects for the UK and African countries, and finds a generally positive 
relationship, except when using Fine Geography*Year effects. The UK is unusual, having only a single 
sample and only coarse geography measures (so RaceShare is at the coarse geography level). Standard error 
clustering is also at the coarse geography level, or at the fine geography level where this is available (since 
the small number of time periods makes clustering by time either inadvisable or outright impossible). 
Column 1 includes Demographics and Ln Population, and finds a positive and significant effect. This 
disappears in column 2, when Coarse Geography controls are added (which, recall, are at the same level as 
the RaceShare variable, so are more equivalent to the fine geography controls in other countries). In 
columns 3, 6 and 9 we find positive and significant effects for RaceShare in Mozambique, South Africa, 
and Zimbabwe, when controlling for Coarse Geography*Year and Demographics. In columns 4, 7 and 
10, these remain positive and significant, and actually increase in magnitude, when controls are added for 
Demographics*(Coarse Geography, Year) and Ln Population Density*Year. When controls for Fine 
Geography*Year are added (looking only at variation between races in the same area, like Table 2 Panel A 
Column 11), the effect is insignificant in Mozambique and Zimbabwe, but still significant in South Africa. 

Panel B shows effects for Central America. Costa Rica and Cuba show mixed results, being zero 
in the first specification, negative and marginally significant when adding Demographics*(Coarse 
Geography, Year) and Ln Population Density*Year, but positive and significant when adding Fine 
Geography*Year. El Salvador shows a similar pattern to Panel A: positive and significant in the first and 
second specifications, but insignificant when Fine Geography*Year effects are added. Jamaica shows 
insignificant effects in all specifications. 

Panel C shows the effects for South America. Brazil shows the only reliably negative effects across 
all specifications. These are small in magnitude compared with other countries, but the large number of 
observations (over 16 million) makes them significant. Colombia shows insignificant results in all 
specifications. Ecuador shows the Panel A pattern, of being positive and significant in the first two 
specifications, but insignificant when Fine Geography*Year controls are added. Uruguay is negative and 
significant in the first two specifications, but positive and significant in the third. 

Overall, these results show that the effect in international data is not as ubiquitous as in the U.S., 
but neither is it limited only to U.S. data. The number of countries where the results are “some positive and 
significant, none negative and significant” is six, whereas “some negative and significant, none positive 
and significant” is only one. Moreover, the pattern of which countries show positive effects does not tell an 
obvious story. While U.S. descriptions of race relations often focus on the interaction between black and 
white populations, the results are present in the UK (which has far fewer blacks), and African countries 
(which are overwhelmingly black), but not in Jamaica (which is also overwhelmingly black) nor Brazil 
(which has a large black population by U.S. definitions, but also a rather different conception of racial 
categories). We consider the question of understanding all these sources of variation to be interesting, but 
beyond the scope of the paper. Instead, these results serve as an indication that the results may arise from 
forces that occur in other countries as well, but not universally so. 


3.7 Potential Explanations — Marriage and Divorce 

Next, we explore the implications of racial diversity for other relationship outcomes, particularly 
those relating to marriage. One version of the baseline explanation is that the effect of diversity on the 
number of children is narrowly related to some aspect of the costs of raising children, and thus represents 
a decision only along this dimension. If instead the impact on the number of children is related to broader 
issues that affect relationships, then we might expect to see effects on rates of marriage and divorce. These 
additional relationship outcomes are useful tests of the underlying mechanisms, being consistent with both 
trust and homophily explanations. At a minimum, if the effects extend beyond childbearing decisions and 
influence marriage formation and dissolution, it suggests that the role of racial diversity in shaping family 
structures and dynamics is more nuanced than a narrow focus on the costs of childrearing might suggest. 

We analyze this question in Table 8. In Panel A, we consider the probability that a woman is 
currently married at the time of survey. Using the same specifications as in Table 2 Panel B, we find that 
higher levels of RaceShare are associated with a greater likelihood of the woman being currently married, 
significant at the 1% level in all specifications. The effect of a one standard deviation unconditional increase 
in RaceShare on the probability of being married ranges from 1.2 to 6.2 percentage points (in columns 7 and 1, 
respectively). In Panel B, we instead consider the effect on whether a woman was ever married (that is, the 
variable equals one if the woman is married, divorced or widowed, and zero if she is single). The effects here 
are generally similar in magnitude and statistical significance to Panel A. A one standard deviation 
unconditional increase in RaceShare is associated with a 1.2 to 6.2 percentage point higher chance of ever 
being married. 

Panel C examines the age at first marriage. This is for a subset of the data, as we limit the sample 
to women who are currently married, and who have only been married once (as others lack data on the age 
of first marriage). The effects here are the most reliable of the four panels. A one standard deviation 
unconditional increase in RaceShare is associated with getting married between 6.6 months earlier (column 
1) and 2.3 months earlier (column 7). 

Panel D examines the probability of being divorced, conditional on getting married. In other words, 
the dummy variable now equals one if the woman is divorced, and zero if she is married or widowed (with 
single now being omitted). While there are some effects of RaceShare on lower divorce rates in the early 
specifications, the effects are smaller and less consistent, with the versions with tighter controls showing 
no effect. A one standard deviation change in RaceShare is associated with a change in divorce probability 
ranging from 1.9 percentage points lower (column 1) to 0.4 percentage points higher (column 2). 

Overall, these results reinforce that diversity is negatively associated with marriage rates, with the 
main drivers being whether and when women marry, more so than whether they divorce. This 
supports the conclusion that diversity is associated with broader relationship effects, rather than just 
child-rearing costs, narrowly defined. 


3.8 Sex and Race Differences in Interracial Marriage 

Next, we consider one of the channels that might contribute to a causal interpretation of the main 
result. In particular, we consider the role of homophily in relationship preferences. There is considerable 
evidence of a general tendency of people to prefer to marry someone similar to them. Similarity in terms of 
race is one of the strongest of these. If people have a preference for marrying someone of the same race, 
then the fraction of the population around them who are of that race is an important determinant of the 
chances of them finding a suitable marriage partner. As well as there being evidence of assortative matching 
along racial dimensions (e.g., Hwang 2012), the evidence from sperm donation suggests a preference for 
same-race traits in donors (Daniels and Heidt-Forsythe 2012), even when the man will not be present in the 
woman’s life (and thus correlated aspects like partner income cannot be driving the choice). 

A number of results already presented are consistent with homophily as a driver. 
First, diversity is associated with the chances of getting married, and with how long it takes to 
get married. Second, the across-area coefficients decline over time in Table 4 (although 
the specification with Area*Year does not show this pattern), consistent with greater social 
acceptance of interracial relationships and marriage. 

It is tempting to test this channel directly by controlling for whether the woman married 
someone of a different race, but this is unlikely to be sufficient. If the pool of same-race partners shrinks, 
then one may still end up marrying someone of the same race, but in a worse quality match than one 
might otherwise have found in a larger pool. The challenge lies in the fact that the characteristics 
determining the quality of a match are often difficult to observe and quantify. As a result, simply controlling 
for interracial marriage may not fully capture the impact of a reduced pool of same-race partners on the 
quality of marriages and on relationship outcomes such as fertility. To gain a more comprehensive 
understanding of the relationship between racial diversity and family formation, it is important to consider 
not only the direct effects on interracial marriage but also the more subtle ways in which partner availability 
and match quality may be affected by the size and composition of the pool of potential partners. 

We turn to two additional tests of this hypothesis. The essential component of homophily as an 
explanation is the preference for same-race marriage over interracial marriage. But this preference for 
marrying within one's own race is unlikely to be uniform across all racial groups and historical periods. To 
take a stark example — the census category of “three or more races” is unlikely to be a source of strong 
homophily, whereby people of three or more races only want to marry someone else of three or more races 
(regardless of what those three races actually are), rather than someone of two races (classified as a different 
racial group), or someone of one of those three racial groups, or anyone else. More broadly, different racial 
groups likely have different norms about the importance of marrying someone of the same race. If one is 
from a group where interracial marriage is strongly discouraged, then it should be of larger importance to 
live around more people of the same race, so as to have a larger dating and marriage pool. 

One minor complication of this question is that the raw level of interracial marriage is somewhat 
mechanically related to RaceShare itself. For instance, in a state where whites make up 90% of the married 
population, it is simply not possible for them to have a large interracial marriage rate (whereas the 10% 
remaining population could, in principle, all marry someone of a different race). Instead, we compute the 
abnormal intermarriage rate, by comparing the actual intermarriage rate nationwide for that race and year, 
with the simulated distribution if all married people that year paired up randomly. We compute 1000 
simulations of random pairings, and compute a z-score as the actual intermarriage rate, minus the simulation 
average, divided by the simulation standard deviation. We interact this variable with the RaceShare variable. 
We include the abnormal intermarriage rate on its own only in the specification that does not include 
Race*Year fixed effects, as these subsume the abnormal intermarriage rate. 
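
The z-score described above can be illustrated with a minimal Python sketch. It assumes married couples are stored as two aligned arrays of husband and wife race labels, and it simulates random pairing by shuffling wives across husbands. The function name, the toy data, and the choice to pool husbands and wives of a race into one rate are illustrative assumptions, not the code used for the paper.

```python
import numpy as np

def abnormal_intermarriage_z(husband_race, wife_race, race, n_sims=1000, seed=0):
    """z-score of the actual intermarriage rate for `race` against random pairing."""
    husband_race = np.asarray(husband_race)
    wife_race = np.asarray(wife_race)
    rng = np.random.default_rng(seed)

    def rate(h, w):
        # Share of married people of `race` (either sex) whose spouse is of another race.
        h_in, w_in = (h == race), (w == race)
        n_people = h_in.sum() + w_in.sum()
        n_inter = (h_in & (w != race)).sum() + (w_in & (h != race)).sum()
        return n_inter / n_people

    actual = rate(husband_race, wife_race)
    # Random pairing: hold husbands fixed and shuffle the wives.
    sims = np.array([rate(husband_race, rng.permutation(wife_race))
                     for _ in range(n_sims)])
    return (actual - sims.mean()) / sims.std()

# Toy data (hypothetical): 100 couples with strong same-race matching.
husbands = ["A"] * 80 + ["B"] * 20
wives = ["A"] * 78 + ["B"] * 2 + ["A"] * 2 + ["B"] * 18
z_a = abnormal_intermarriage_z(husbands, wives, "A")
print(z_a)  # strongly negative: far less intermarriage than random pairing implies
```

A negative value indicates less intermarriage than random pairing would produce, which is the sense in which the measure nets out the mechanical link with RaceShare.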

However, intermarriage rates may also be reflecting other differences across races that matter for 
other reasons. For this purpose, we turn to a second, sharper prediction, namely sex differences in 
intermarriage rates. In particular, men and women of the same race often “marry out” of their race at 
different rates. Each man that marries a woman of a different race reduces the pool of marriageable men for 
women of his own race. If the population sizes of the sexes are roughly equal, then neither sex will have a 
surplus of potential partners as a baseline. This highlights the essential aspect that intermarriage rates for 
men increase the pressure on women of the same race, and vice versa. Unlike the previous tests, any overall 
traits that are common to both men and women of that race, however arising, should not affect this rate. 

We measure this by calculating the interracial marriage gender ratio for each race and year. This 
ratio is obtained by dividing the fraction of married women of a given race who have a husband of a different 
race by the corresponding fraction of married men of the same race who have a wife of a different race. For 
these tests in Table 9 on intermarriage rate, and sex differences in intermarriage rate, we take both men and 
women ages 18-40 (as opposed to the other tables, which only include women). We replace the controls for 
State*Year, Demographics*(State, Year) and Area*Year with interactions with Sex, so that these effects can 
vary between men and women (so we have Sex*State*Year, Sex*Demographics*(State, Year), and 
Sex*Area*Year respectively). The only exception here is that we cannot include Race*Sex*Year because 
this would absorb the variation we are using, so instead we use Race*Year and Race*State as before. 
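
As a rough illustration of the ratio defined above, the following Python sketch computes it for one race from a list of couples in a single year. The function name and the toy couples are hypothetical; the sketch also assumes that at least some men of the race marry out, since the ratio is otherwise undefined.

```python
def intermarriage_sex_ratio(couples, race):
    """couples: iterable of (husband_race, wife_race) tuples for one year."""
    couples = list(couples)
    wives = [(h, w) for h, w in couples if w == race]      # married women of `race`
    husbands = [(h, w) for h, w in couples if h == race]   # married men of `race`
    # Fraction of married women of `race` with a husband of a different race.
    female_out = sum(h != race for h, _ in wives) / len(wives)
    # Fraction of married men of `race` with a wife of a different race.
    male_out = sum(w != race for _, w in husbands) / len(husbands)
    return female_out / male_out  # undefined if no men of `race` marry out

# Toy data (hypothetical): men of race "A" marry out more often than women do.
toy = [("A", "A")] * 90 + [("A", "B")] * 10 + [("B", "A")] * 5 + [("B", "B")] * 20
ratio_a = intermarriage_sex_ratio(toy, "A")
print(ratio_a)
```

A ratio below one means that men of the race marry out more often than women, so the pool of same-race partners is tighter for women. A ratio above one means the reverse.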

These results are presented in Table 9. The first four columns show the effect of the abnormal 
intermarriage rate. We find that higher levels of abnormal intermarriage are associated with a smaller effect 
of RaceShare. That is, RaceShare*AbnormalIntermarriageRate is negative and significant in three of the 
four specifications. The interpretation here is that when men and women of that race are more likely to 
marry people of different races, then it makes less difference to their average number of children whether 
they are living near people of the same race or not. The only exception to this is when we add controls for 
Sex*Area*Year, when the effect becomes smaller and insignificant. 

Next, we consider the effect of sex differences. Our main variables of interest are thus 
RaceShare*IntermarriageSexRatio and RaceShare*IntermarriageSexRatio*Male. The former estimates 
the effect of women marrying out more on women of that race, while the latter is the increased effect of 
RaceShare for men relative to women as the female rate of marrying out increases. 

In columns 5-8, we consider the effect of the IntermarriageSexRatio and its interaction with Male. 
The key prediction concerns the triple interaction, RaceShare*IntermarriageSexRatio*Male, which is positive 
and significant in all specifications. That is, when women marry out at higher rates relative to men, then it 
matters more for men whether they are living in a high RaceShare area or not, relative to how much it matters 
for women. Regardless of the other effects of the sex ratio, the reduction in fertility is greater for 
males. The coefficient on RaceShare*IntermarriageSexRatio is negative, but not generally significant. 


3.9 Trust 
Next, we turn to the second major theory that could explain our results, namely social trust. 
Higher levels of racial diversity are associated not just with lower direct levels of trust (i.e., survey 
respondents' answers as to whether you can generally trust people or not), but also with a variety of other 
aspects of social capital related to trust. As Putnam (2007) describes: 
“In areas of greater diversity, our respondents demonstrate: 

* Lower confidence in local government, local leaders and the local news media. 

* Lower political efficacy — that is, confidence in their own influence. 

* Lower frequency of registering to vote, but more interest and knowledge about politics and more participation in protest marches and social reform groups. 

* Less expectation that others will cooperate to solve dilemmas of collective action (e.g., voluntary conservation to ease a water or energy shortage). 

* Less likelihood of working on a community project. 

* Lower likelihood of giving to charity or volunteering. 

* Fewer close friends and confidants. 

* Less happiness and lower perceived quality of life. 

* More time spent watching television and more agreement that 'television is my most important form of entertainment'.” 

It is plausible that some or all of these factors are related both to the likelihood of people finding a 
suitable marriage partner, and their choice of how many children they would like to bring into the world. 
The predictions for this hypothesis are not as sharp as those for homophily. However, the two aspects that 
are the most straightforward are that i) trust levels should be positively associated with birth rates, and ii) 
controlling for trust levels should reduce the effect of RaceShare. 

We consider two ways of measuring trust. The first is the direct measure used in the 2006 Social 
Capital Benchmark Survey, where respondents are asked “Generally speaking, would you say that most 
people can be trusted or that you can't be too careful in dealing with people?”. We take the state level 
average fraction responding “people can be trusted” as a proportion of those answering either this response, 
or “you can't be too careful” (with responses of “it depends”, “don’t know” or refusal to answer being 
omitted). Because we have the state level information only for 2006, we apply these survey responses to all 
years for that state. As a result, in these regressions, we cannot include state fixed effects, or state 
interactions with other fixed effects. Instead, our controls are Demographics*Year. In this respect, these 
specifications control for much less than those above. Nonetheless, our interest is how much these state 
metrics explain fertility, and how much they reduce the effect of RaceShare under this level of controls. 
We present these results in Panel A. Because our trust metrics are state level, we compare them to 
both state level versions of RaceShare (in columns 1-6), and with the baseline local version (in columns 7- 
12). Finally, because our trust measure is only from a single year, we also vary the time sample to be years 
increasingly matched to the timing of the trust measure: all years (columns 1 and 2), 2001-2010 (columns 
3 and 4), and 2006 only (columns 5 and 6). We find the first prediction confirmed: in all specifications, 
states with higher levels of general trust (StateLevelTrust) in 2006 have higher birth rates, when controlling 
for Demographics*Year. Second, we find that controlling for StateLevelTrust also reduces the effect of 
RaceShare(State) and RaceShare(Base). Columns 1, 3 and 5 compute the coefficient on RaceShare(State) 
for the years in question and observations where we can match StateLevelTrust data. Columns 2, 4 and 6 
show the same coefficient once StateLevelTrust is controlled for. In column 2, the effect of RaceShare(State) 
is reduced by 21% once we add trust controls. In the 2001-2010 sample, the reduction is slightly larger, at 
24%, in column 4. In 2006 alone, the reduction is slightly larger still, at 26%. Columns 7 to 12 show similar 
effects, albeit smaller reductions, when the baseline geography definitions are used. This is consistent with 
broad state-level trust measures having less ability to drive out geographically tighter measures of diversity. 
In Panel B, we examine a different set of social capital metrics. These are taken from Chetty et al. 
(2022), which uses Facebook data to construct county-level measures of various social capital metrics. We 
focus on the three major measures of that paper: the volunteering rate (i.e., the fraction of people who 
participate in a volunteer organization), friendship clustering (the chances that, if A and B are friends, and 
C is friends with B, then A is also friends with C), and economic connectedness (the share of above median 
income friends among people with below median income). As these measures are at the county level, we 
compare them with county-level versions of RaceShare. We compare the baseline RaceShare variable in 
the same periods and counties that we have social capital measures, and then add the three social capital 
measures. We do this for all years (columns 1-2), 2011-2021 (columns 3-4) and 2021 only (columns 5-6). 
Comparing the univariate RaceShare effect with the version with all three social capital measures 
included, the coefficient is reduced by 37%, from 0.318 to 0.200 in the full sample (with similar effects in 
other year ranges). Volunteering rates have directionally positive effects in all specifications, but are only 
significant when only using 2021 data. Friendship clustering and economic connectedness are negative in sign 
but insignificant. Overall, these results are consistent with social trust being a contributing factor to the 
racial diversity / fertility effect, with estimates ranging from 20-37% of the effect coming from this channel. 
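
The share of the RaceShare effect attributed to added controls is the proportional drop in the coefficient once those controls are included. A minimal sketch of that arithmetic, using the full-sample Panel B coefficients reported above (the function name is illustrative):

```python
def pct_reduction(coef_without, coef_with):
    """Percentage fall in a coefficient once extra controls are added."""
    return 100.0 * (coef_without - coef_with) / coef_without

# Full-sample RaceShare coefficients, without and with the three social capital measures.
print(round(pct_reduction(0.318, 0.200), 1))  # 37.1
```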


3.10 Diversity Along Other Dimensions 

Next, we examine whether population shares have similar effects across demographic variables in 
general. First, this represents a robustness test for the possibility that RaceShare is just proxying for that 
person’s similarity to their local area along other, correlated dimensions. Second, it examines a related aspect of 
homophily: do people have higher birth rates if they are surrounded by people who are similar along other 
dimensions, or just primarily race? Conceptually, homophily does not require that the preference for 
similarity be equally strong on, or even present on, every dimension of possible similarity. Nonetheless, 
examining more variables helps distinguish between a version of homophily where race is just one example 
among many, or a version where race is one of the primary aspects of marital preferences. 

We construct analogous [Variable]Share variables for different demographic aspects of similarity. 
In Table 11 Panel A, we consider education, income decile, and age. Because shares depend on how coarsely 
or finely the groups are defined, coefficients are not directly comparable. For age, we calculate the share as 
the fraction of the population that is between two years younger and ten years older than the woman. We 
consider the effect of these variables under different fixed effects combinations as before. 
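
To make the construction concrete, the sketch below computes a generic group share and the age-window share for each person in a small sample. It is a simplified illustration under stated assumptions: the function names and toy data are hypothetical, the person is counted in their own local population, and survey weights are ignored.

```python
from collections import Counter

def group_share(area, group):
    """For each person, the fraction of their area's population in the same group."""
    area_n = Counter(area)
    cell_n = Counter(zip(area, group))
    return [cell_n[(a, g)] / area_n[a] for a, g in zip(area, group)]

def age_share(area, age, younger=2, older=10):
    """For each person, the fraction of their area aged in [age - younger, age + older]."""
    ages_by_area = {}
    for a, x in zip(area, age):
        ages_by_area.setdefault(a, []).append(x)
    return [
        sum(x - younger <= y <= x + older for y in ages_by_area[a]) / len(ages_by_area[a])
        for a, x in zip(area, age)
    ]

# Toy data (hypothetical): three people in area X, one in area Y.
areas = ["X", "X", "X", "Y"]
education = ["hs", "hs", "ba", "ba"]
ages = [20, 25, 40, 30]
print(group_share(areas, education))
print(age_share(areas, ages))
```

Because the share mechanically falls as groups are defined more finely, the same one-unit change means different things for different variables, which is why the coefficients are not directly comparable.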

Panel A shows that EducationShare has a sign that is positive, but loses significance with additional 
controls. IncomeDecileShare, in columns 5-8, shows a positive and statistically significant effect in all 
specifications, including with Area*Year fixed effects. Being of similar income to the people around you 
has the greatest similarity with race in its effect on fertility outcomes. The unconditional effect of a one 
standard deviation increase in IncomeDecileShare ranges from 0.024 more children in column 5 to 0.028 
more children in column 6. In columns 9-12, AgeShare shows inconsistent effects across specifications. 

In Panel B, we consider two other dimensions of similarity, namely the fraction of people that share 
the same status of being born in the US or not, and the fraction of people that share the same citizenship 
status. Both USBornShare and CitizenShare show positive and significant effects in the first two columns, 
but insignificant negative effects once more controls are added. Overall, these effects show a similarly 
positive and consistent homophily effect for income, but weaker or inconsistent effects for the other 
variables. This suggests that race is not unique as a variable where homogeneity is associated with higher 
fertility, but neither do all important demographic variables have the same effect. 

In Panel C, we consider these variables together. We consider the same four specifications, showing 
coefficients in columns 1-4, and the effect of a one standard deviation unconditional change in each of the 
variables in columns 5-8. Importantly, RaceShare is positive, significant, and of a similar magnitude in all 
specifications. Adding in other area-level controls has a larger effect on the RaceShare coefficients relative 
to Table 2 Panel A in the early columns, because it represents a relatively greater addition of new control 
variables. However, with the Area*Year fixed effects in column 4, the RaceShare coefficient is 0.191, very 
close to the 0.197 in the equivalent specification in Table 2 Panel A column 11. 

The effect of RaceShare is about two and a half times as large as IncomeDecileShare, with effect 
sizes between 0.061 and 0.073 additional children, compared with 0.022 to 0.023 additional children for 
IncomeDecileShare. Finally, a number of the variables that show weak or inconsistent “univariate” effects 
(i.e., as the only [Variable]Share variable) show very different patterns after controlling for other aspects 
of similarity. CitizenShare is now large and economically significant, but USBornShare is now negative 
and significant, as is EducationShare. This suggests that if our main result is driven by homophily, then a 
number of variables show somewhat complicated effects, whereby a trait that is desirable at a univariate 
level may be undesirable once other correlated aspects of matches are controlled for. 

The fact that income share is the next most reliable measure is consistent with the considerable 
evidence for assortative matching based on income (Chiappori, Salanie and Weiss 2017, Greenwood et al. 
2014, Fernandez et al. 2005, Schwartz and Mare 2005, and Chiappori et al. 2022). It is less obvious that 
this represents an explicit preference for homophily (i.e., it is not clear that lower income people explicitly 
prefer their partners to also have low income). Our result can also arise in matching models like Becker 
(1971), where everyone wants a richer partner, but has to be richer themselves to attract them. Even in this 
model (which lacks homophily), higher diversity may still lower marriage and birth rates, if the rich view 
their local low-income partner possibilities as being worse than the outside option of remaining single. 


3.11 Time Series Evidence 
Finally, we consider directly the original motivating question we began with: the time series changes in 
overall fertility and diversity. This aspect is somewhat implicit in the Table 2 specification with no controls, 
but we wish to test the overall question using simpler methods: how much of the overall change in fertility 
could plausibly be related to the increase in diversity and racial isolation?⁵ In Table 11 Panel A, our 
dependent variable is the total fertility rate for the US, from FRED, since 1961. We relate this to the average 
of the RaceShare variable in the year before (when most conception decisions would have been made). We 
include as controls various economic variables lagged by a year: inflation, GDP growth, and 
unemployment. In column 1, the univariate effect of average RaceShare is 1.440, with a t-statistic of 3.86. 
In terms of economic magnitude, there are two ways to think of this. First, the R-squared of the regression 
is 0.439, indicating that a substantial amount of year-to-year variation is explained. Secondly, we consider 
the full time-series change over the period (a drop in TFR of 0.602), versus the predicted change in TFR 
based on the changes in average RaceShare and the regression coefficient, and get a predicted change of 
0.393. That is, the variable explains 65.3% of the overall drop in fertility. 
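
The explained share quoted above is the predicted change in TFR (the regression slope times the change in average RaceShare over the period) divided by the actual change. The sketch below illustrates this with a univariate least-squares fit; the function name and the synthetic test series are assumptions for illustration, and the only reported figures used are the 0.393 predicted and 0.602 actual declines.

```python
import numpy as np

def explained_share(race_share_lagged, tfr):
    """Slope of TFR on lagged average RaceShare, and predicted/actual change."""
    x = np.asarray(race_share_lagged, dtype=float)
    y = np.asarray(tfr, dtype=float)
    X = np.column_stack([np.ones_like(x), x])            # intercept and regressor
    (intercept, slope), *_ = np.linalg.lstsq(X, y, rcond=None)
    predicted_change = slope * (x[-1] - x[0])            # first to last observation
    actual_change = y[-1] - y[0]
    return slope, predicted_change / actual_change

# Check against the figures reported in the text: 0.393 predicted vs 0.602 actual.
print(round(100 * 0.393 / 0.602, 1))  # 65.3
```

A share above 100%, as in some of the later columns, simply means the fitted relationship predicts a larger decline than the one actually observed.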

Column 2 adds economic controls. The effect increases in both magnitude and significance, and 
now explains 117% of the overall decline. Because the level of geographic measurement varies over the 
sample, columns 3, 4 and 5 show the effect of average RaceShare measured for cities only, counties only, 
and states. The effect is insignificant for cities only, but significant for the remainder. Predicted changes are 
245.8%, 85.2% and 117.0% of actual declines respectively. Columns 6-10 limit the sample to 2006 onwards, 
where the continuous availability of annual data allows us to use Newey-West standard errors, here with a 
lag of five years. The effects are now larger in coefficient and significance during this period. The univariate 
R-squared in column 6 is now 88.6%. Predicted changes relative to actual amount to between 94.8% and 
115.3% of the actual changes. 

5 In the time series tests, it is not possible to distinguish between the effects of national averages of RaceShare and 
RaceHerfindahl, as the time series correlation is 0.99. In other words, at the national level, higher average diversity 
and higher average racial isolation are extremely strongly linked, and thus their effects cannot be separated. 

In Panel B, our dependent variable is the unadjusted average number of children for respondents. 
This is sensitive to other factors like the age profile of the women being sampled, but it has the advantage 
of being easy to construct back to 1850. In columns 1-4, we find that all geographic measures work as 
univariate predictors since 1850 with t-statistics above 6. R-squared values range from 0.571 to 0.796, and 
predicted changes as a fraction of actual are 85.8%, 70.4%, and 83.6% for base, city and state respectively 
(with county being an odd outlier at explaining 3892%, partly due to the smaller number of observations). 

Finally, in columns 5-8, we repeat the analysis using only decennial observations to mitigate the 
potential influence of the numerous annual observations available since 2000. The results obtained from 
this restricted sample are consistent with our previous findings. 

With aggregate time series changes it is hard to say for sure what their 
drivers are, and the ability to make causal statements is very limited. Nonetheless, to the extent that one 
believes in a potential causal channel from the tighter cross-sectional tests already discussed, these tests 
serve to show that the magnitudes of the time series changes are considerable, and that changes in diversity 
may be important variables for helping quantitatively explain the decline in birth rates that we observe. 


4. Conclusion 


In this paper, we document a new and important stylized fact linking the central demographic 
changes of our time. Women living in areas of higher racial diversity robustly have fewer children. We do 
not explicitly argue that this represents a causal relationship, but the obvious non-causal explanations have 
considerable difficulty explaining the range of facts we document. The effect is present in every period that 
U.S. Census data is easily obtainable, so it is not an artefact of modern race relations. It is present for many 
different races of women, so is not just related to black/white race relations. It holds (unevenly) in other 
countries, so while it is not an inevitable human universal, it is also not limited to the U.S. Diversity is not 
only associated with the direct costs of raising children, but also other relationship outcomes like the 
likelihood of getting married, and the age at which marriage occurs. The findings suggest that the impact 
of racial diversity extends beyond the narrow scope of childrearing expenses and influences multiple 
aspects of family formation and stability. 

What alternatives are left that fit all the facts above? The strongest of these is preferences for 
homophily in partner choice, and we present evidence specifically consistent with this, from differences in 
interracial marriage across races, and between the sexes within a race. These additional results are hard to 
explain under competing theories. More speculative, but potentially also important, is the role of social 
trust. Putnam (2007) links this to racial concentration, the more direct measure of diversity. Our results also 
show a negative relationship between racial concentration and birth rates, and this generally holds 
controlling for race share. The relationship between racial isolation specifically, and what these other 
aspects of racial concentration are capturing, is an important avenue for future studies. 

Table 1 — Summary Statistics 


This table presents summary statistics for the main variables used in the paper. Data is taken from U.S. Census and American Community Survey files from 1850 
to 2021, obtained from IPUMS, for all women aged 18-40 at the time of survey. Number of Children is the number of children the woman has living at home at the 
time of the survey. Race Share is the fraction of the local area population of the same race as the woman, where race is the nine broad racial groups classified by 
the census, plus a tenth for Hispanic/Latino. Race Herfindahl is the sum of squared percentages for each racial group in that local area. Panel B presents breakdowns 
by year. Local areas are defined as being first city (if available), then county (if available), then detailed metro area. The number of local areas and total respondents 
is shown, along with race shares for the ten groups in that year. A blank value means that classification was not collected at the time. 


Panel A - Whole Sample 

                    N          Mean   Std Dev  Min    25th Pct  50th Pct  75th Pct  Max 
Number of Children  7,156,888  0.98   1.30     0      0         0         2         9 
Age                 7,156,888  28.93  6.68     18     23        29        35        40 
Race Share          7,156,888  0.557  0.324    0.000  0.231     0.635     0.855     1.000 
Race Herfindahl     7,156,888  0.587  0.212    0.177  0.411     0.572     0.763     1.000 

Panel B - By Year 

                         Race Share  Race Share 
Year  N        # Areas   Mean        SD          White   Black 
1850  7,418    72        0.917       0.191       0.9520  0.0453 
1860  12,548   105       0.938       0.168       0.9656  0.0308 
1870  18,509   176       0.891       0.212       0.9240  0.0715 
1880  26,751   259       0.885       0.217       0.9187  0.0750 
1900  60,426   513       0.894       0.209       0.9243  0.0701 
1910  85,664   693       0.886       0.216       0.9182  0.0739 
1920  98,633   398       0.871       0.227       0.9141  0.0740 
1930  142,954  1,111     0.842       0.246       0.8919  0.0889 
1940  140,756  234       0.834       0.251       0.8904  0.0908 
1950  192,521  259       0.777       0.274       0.8494  0.1169 
1970  403,112  187       0.716       0.301       0.7961  0.1290 
1980  320,561  511       0.687       0.309       0.7591  0.1334 
1990  332,815  565       0.647       0.316       0.7182  0.1234 
2000  241,894  159       0.486       0.297       0.5609  0.1504 
2003  20,136   61        0.572       0.323       0.6345  0.0777 
2005  312,914  651       0.555       0.323       0.6213  0.1167 
2006  322,868  650       0.542       0.319       0.6083  0.1210 
2007  324,831  650       0.537       0.318       0.6037  0.1193 
2008  322,150  650       0.529       0.317       0.5959  0.1205 
2009  325,889  650       0.522       0.315       0.5887  0.1220 
2010  328,152  650       0.503       0.301       0.5746  0.1254 
2011  327,101  650       0.500       0.307       0.5620  0.1329 
2012  274,379  504       0.473       0.300       0.5329  0.1304 
2013  280,088  504       0.474       0.300       0.5390  0.1259 
2014  278,666  504       0.468       0.299       0.5336  0.1253 
2015  281,817  504       0.466       0.298       0.5326  0.1222 
2016  282,583  504       0.464       0.297       0.5326  0.1183 
2017  288,755  504       0.461       0.297       0.5321  0.1142 
2018  290,629  504       0.462       0.297       0.5348  0.1111 
2019  287,380  504       0.463       0.299       0.5388  0.1069 
2020  234,678  504       0.440       0.292       0.5181  0.1059 
2021  289,310  504       0.428       0.286       0.5008  0.1039 


Amer. 
Indian 


0.0002 
0.0001 
0.0000 
0.0001 
0.0001 
0.0002 
0.0004 
0.0003 
0.0004 
0.0022 
0.0047 
0.0058 
0.0049 
0.0067 
0.0051 
0.0052 
0.0049 
0.0050 
0.0048 
0.0051 
0.0060 
0.0050 
0.0049 
0.0046 
0.0046 
0.0043 
0.0044 
0.0045 
0.0044 
0.0042 
0.0041 


Chinese 


0.0002 
0.0002 
0.0009 
0.0002 
0.0004 
0.0003 
0.0005 
0.0005 
0.0010 
0.0037 
0.0054 
0.0099 
0.0155 
0.0089 
0.0149 
0.0158 
0.0158 
0.0159 
0.0163 
0.0169 
0.0181 
0.0206 
0.0209 
0.0223 
0.0230 
0.0240 
0.0244 
0.0245 
0.0257 
0.0271 
0.0263 


Japanese 


0.0000 
0.0008 
0.0017 
0.0019 
0.0010 
0.0017 
0.0052 
0.0040 
0.0045 
0.0042 
0.0114 
0.0040 
0.0036 
0.0037 
0.0033 
0.0033 
0.0030 
0.0030 
0.0033 
0.0034 
0.0030 
0.0029 
0.0028 
0.0028 
0.0028 
0.0026 
0.0025 
0.0025 


Asia Pac. 


0.0002 
0.0005 
0.0005 
0.0002 
0.0003 
0.0048 
0.0137 
0.0257 
0.0458 
0.0502 
0.0483 
0.0496 
0.0517 
0.0529 
0.0540 
0.0555 
0.0546 
0.0614 
0.0618 
0.0637 
0.0637 
0.0647 
0.0674 
0.0668 
0.0680 
0.0703 
0.0734 


Other 


0.0001 
0.0013 
0.0011 
0.0009 
0.0022 
0.0026 
0.0031 
0.0030 
0.0030 
0.0028 
0.0027 
0.0023 
0.0022 
0.0024 
0.0026 
0.0026 
0.0025 
0.0028 
0.0029 
0.0030 
0.0029 
0.0047 
0.0058 


Two Races 


0.0192 
0.0162 
0.0125 
0.0134 
0.0146 
0.0159 
0.0170 
0.0188 
0.0201 
0.0208 
0.0216 
0.0226 
0.0230 
0.0238 
0.0248 
0.0252 
0.0262 
0.0405 
0.0438 


Three 
Races 


0.0014 
0.0037 
0.0011 
0.0012 
0.0012 
0.0016 
0.0015 
0.0018 
0.0023 
0.0031 
0.0035 
0.0036 
0.0034 
0.0038 
0.0038 
0.0039 
0.0037 
0.0052 
0.0052 


Hispanic 
0.0027 
0.0033 
0.0042 
0.0054 
0.0053 
0.0063 
0.0092 
0.0160 
0.0168 
0.0301 
0.0578 
0.0784 
0.1118 
0.1954 
0.1880 
0.1731 
0.1789 
0.1822 
0.1863 
0.1898 
0.1966 
0.1989 
0.2200 
0.2165 
0.2187 
0.2220 
0.2229 
0.2231 
0.2235 
0.2208 
0.2215 
0.2344 
Table 2 — Number of Children and Racial Diversity 


This Table presents the baseline relationship between the number of children a woman has and various measures of local levels of racial diversity. Data is taken 
from the U.S. decennial census from 1850 to 2000, and from the American Community Survey from 2001 to 2021. Observations are taken for women ages 18-40 
at the time of survey who have non-missing geographic information for either city, county, or detailed metro area. The dependent variable is the number of children 
the woman has. In Panel A, the main independent variable is Race Share, the fraction of the population in the local area who are of the same race/ethnicity as the 
woman (“race”, as a shorthand). Race is constructed as ten categories, with nine categories for the broad racial groups (if the respondent is not Hispanic or Latino) 
and a tenth category for Hispanic/Latino. Local area is measured first as county (if present), then city (if county is missing), then metro area (if both city and county 
are missing). Fixed effects are included as labeled for race, state, year, state by year, demographics (age, marital status, education, race, employment status, income 
decile, and citizenship), demographics by state and year, area type (i.e., county, city or metro), deciles of population within that area type, deciles of population by 
year, area (where area is measured at the same level as the race share), and area by year. Area parametric controls are included as average income decile, average 
age, fraction employed, and a z-score for the fraction of residents who moved in the last 1 or 5 years (depending on data availability). These are also interacted with 
year fixed effects. The earliest year for data availability (given the set of controls) is noted. The effect on the number of children of a one standard deviation change 
in race share is indicated, both for an unconditional one standard deviation change across all observations, and a conditional standard deviation — a one standard 
deviation change in the residual after first regressing race share on the set of fixed effects in the regression. In Panel B, the race share variable is replaced with a 
race Herfindahl index for that area and year. In Panel C, both the race share and race Herfindahl index are included. In Panel D, the Herfindahl Index is computed 
only among races other than the respondent's own race (so it measures concentration among the races other than your own). Standard errors are double clustered 
by year and state. Coefficients are in the top row, and t-statistics are below in parentheses, with *, ** and *** indicating statistical significance at the 10%, 5% and 
1% level respectively. 
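
The "effect of a one standard deviation change" rows can be read as the coefficient multiplied by a standard deviation of race share. The sketch below illustrates the distinction between the unconditional and conditional versions under a simplifying assumption: a single set of group fixed effects, so that residualizing race share reduces to demeaning within groups. The paper's specifications use many more fixed effects, and the function name `one_sd_effects` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def one_sd_effects(coef, race_share, groups):
    """Return (unconditional, conditional) effects of a 1 SD change in race share.

    coef:       regression coefficient on race share
    race_share: list of race share values, one per observation
    groups:     list of fixed-effect group labels, one per observation
                (a single fixed effect is assumed here for illustration)
    """
    # Unconditional: SD of race share across all observations.
    uncond_sd = pstdev(race_share)

    # Conditional: SD of the residual after removing group means,
    # which is what regressing on one set of group dummies does.
    by_group = defaultdict(list)
    for x, g in zip(race_share, groups):
        by_group[g].append(x)
    group_mean = {g: mean(v) for g, v in by_group.items()}
    residuals = [x - group_mean[g] for x, g in zip(race_share, groups)]
    cond_sd = pstdev(residuals)

    return coef * uncond_sd, coef * cond_sd


# Toy data: two areas with different average race shares.
uncond, cond = one_sd_effects(1.0, [0.2, 0.4, 0.6, 0.8], ["a", "a", "b", "b"])
print(uncond, cond)

# Arithmetic using numbers reported in the tables: a coefficient of 0.197
# times the pooled Race Share standard deviation of 0.324 from Table 1.
print(round(0.197 * 0.324, 3))  # 0.064
```

Because fixed effects absorb part of the variation in race share, the conditional standard deviation is smaller than the unconditional one, which is why the conditional effects are smaller in the more saturated specifications.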
Panel A - Race Share 

Dependent variable is number of children at time of survey 

                                      (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)        (10)       (11)
Race Share                            0.159***   0.708***   0.435***   0.310***   0.241***   0.291***   0.204***   0.160***   0.166***   0.194      0.197***
                                      (2.84)     (6.89)     (6.92)     (7.30)     (8.58)     (7.03)     (6.58)     (5.00)     (5.26)     (5.89)     (5.88)
Effect of 1 σ change (unconditional)  0.052      0.230      0.141      0.101      0.078      0.094      0.066      0.052      0.054      0.063      0.064
Effect of 1 σ change (conditional)    0.052      0.123      0.067      0.049      0.038      0.037      0.026      0.019      0.020      0.020      0.020
Area * Year FE                        N          N          N          N          N          N          N          N          N          N          Y
Earliest Year                         1850       1850       1850       1980       1980       1980       1980       1980       1980       1980       1980
Observations                          7,156,888  7,156,888  7,156,887  5,967,596  5,967,596  5,967,585  5,967,585  5,967,585  5,967,585  5,967,585  5,967,585
R-squared                             0.002      0.017      0.041      0.313      0.392      0.400      0.402      0.402      0.402      0.404      0.405

The Y/N entries for the remaining control rows (Race; State; Year; State-Year FE; Demographics FE; Demographics*(State, Year) FE; Area Type; Area Traits; Area Traits * Year FE; Area Type * Population FE; Area Type * Population FE * Year; Area FE) are not legible in the source. 
Panel B - Race Herfindahl 
Dependent variable is number of children at time of survey 

                                      (1)        (2)        (3)        (4)        (5)        (6)        (7)
Race Herfindahl                       0.590***   0.819***   0.664***   0.342***   0.207***   0.121***   0.037***
                                      (5.33)     (7.76)     (15.02)    (14.95)    (8.76)     (3.91)     (4.16)
Effect of 1 σ change (unconditional)  0.125      0.174      0.141      0.073      0.044      0.026      0.008
Effect of 1 σ change (conditional)    0.125      0.156      0.083      0.044      0.023      0.011      0.001
Earliest Year                         1850       1850       1980       1980       1980       1980       1980
Observations                          7,156,888  7,156,888  7,156,887  5,967,596  5,967,585  5,967,585  5,967,585
R-squared                             0.009      0.022      0.042      0.393      0.402      0.402      0.404

Panel C - Race Share and Race Herfindahl 

                                      (1)        (2)        (3)        (4)        (5)        (6)        (7)
Race Share                            -0.162***  0.262***   0.236***   0.143***   0.157***   0.159***   0.197***
                                      (-5.02)    (4.65)     (4.81)     (4.68)     (4.24)     (4.54)     (5.85)
Race Herfindahl                       0.751***   0.674***   0.530***   0.261***   0.112***   0.023      -0.080***
                                      (6.05)     (6.15)     (11.57)    (9.13)     (3.56)     (0.62)     (-3.48)
Area FE                               N          N          N          N          N          N          Y
Earliest Year                         1850       1850       1980       1980       1980       1980       1980
Observations                          7,156,888  7,156,888  7,156,887  5,967,596  5,967,585  5,967,585  5,967,585
R-squared                             0.010      0.023      0.043      0.393      0.402      0.402      0.404

In both panels, the Y/N entries for the remaining control rows (Race; State-Year FE; Demographics FE; Demographics*(State, Year) FE; Area Type; Area Traits; Area Traits * Year FE; Area Type * Population FE * Year) are not legible in the source. In Panel B, Area FE is Y in column (7). 
Panel D - Race Share and Race Herfindahl Across Other Races 

                                      (1)        (2)        (3)        (4)        (5)        (6)        (7)
Race Share                            0.244***   0.694***   0.528***   0.319***   0.241***   0.182***   0.203***
                                      (4.42)     (7.17)     (6.88)     (7.57)     (7.17)     (5.61)     (5.37)
Other Race Herfindahl                 0.69       0.569***   0.310***   0.235***   0.094***   0.036*     0.014
                                      (6.31)     (5.28)     (3.48)     (4.81)     (3.83)     (2.05)     (0.66)
Race                                  N          Y          Y          N          N          N          N
Earliest Year                         1850       1850       1980       1980       1980       1980       1980
Observations                          7,114,973  7,114,973  7,114,972  5,967,596  5,967,585  5,967,585  5,967,585
R-squared                             0.014      0.024      0.042      0.393      0.402      0.402      0.404

The Y/N entries for the remaining control rows (State-Year FE; Demographics FE; Demographics*(State, Year) FE; Area Type; Area Traits; Area Traits * Year FE; Area Type * Population FE * Year) are not legible in the source. Area FE is Y in column (7). 
Table 3 — Alternative Constructions of Diversity Variable 


This Table presents alternative versions of the main regressions in Table 2. Panel A considers alternative definitions 
of race. This includes i) omitting the Hispanic/Latino category, ii) detailed race measures but omitting the 
Hispanic/Latino category, iii) race interacted with Hispanic/Latino, iv) detailed race measures interacted with 
Hispanic/Latino, v) using ancestry instead of race and ethnicity, and vi) using the baseline measure (broad race plus a 
Hispanic category) only for the population aged 18 and older. Panel B varies the geographic region race share is 
measured at, including i) city only, ii) county only, iii) metro area only, iv) State, and v) a combined measure in a 
different order, namely city first, then county, then metro area. Panel C explores different weighting schemes, including 
i) weighting each area by year equally, ii) weighting each year equally, iii) using census household weights, and iv) 
using census household weights when constructing the race share variable. Standard errors are double clustered by 
year and state. Coefficients are in the top row, and t-statistics are below in parentheses, with *, ** and *** indicating 
statistical significance at the 10%, 5% and 1% level respectively. 


Panel A - Different Race Definitions 
Race Share (No Hispanic 0.183*** 
Category) (7.91) 


Race Share (Detailed Race, 0.188*** 
No Hispanic Category) (8.37) 
Race Share (Hispanic 0.209*** 
Interaction) (6.15) 
Race Share (Detailed Race, 0.223 *** 
Hispanic Interaction) (6.89) 
kkk 
Ancestry Share Ra 
(6.02) 
Race Share (Only 18+ 0.197*** 
population) (5.88) 
Demographics*(State, Y ear) Y Y Y Y Y Y 
Area * Year Y Y Y Y Y Y 
Observations 5,967,586 5,966,368 5,967,568 5,962,421 — 5,161,861 5,967,585 
R-squared 0.403 0.405 0.405 0.409 0.413 0.405 




Panel B - Different Region Measures 


-0.236** 
(-2.42) 
0.215*** 
(6.43) 
N 
Y 
1,531,597 
0.392 


0.193*** 
(6.55) 
Year 
(HH 
Weight) 
Y 
Y 
5,959,683 
0.393 


Race Share (City) Gee: 
(5.99) 
Race Share (County) ail 
(7.66) 
Race Share (Metro Area) Qut 
(3.28) 
Race Share (City, then County, 0.286*** 
then Metro) (6.54) 
Race Share (State) Quot 
(1.86) 
Race Share (Baseline - County, 
then City, then Metro) 
State* Y ear Y Y Y Y N 
Demographics*(State, Y ear) Y Y Y Y Y 
Observations 1,531,597 — 5,008,007 — 3,127,919 — 5,967,585 9,267,466 
R-squared 0.393 0.401 0.394 0.400 0.392 
Panel C - Weighting 
Race Share (Baseline - Unweighted) 0.210*** ^ 0.197*** 0.190*** 
(6.47) (5.87) (6.22) 
Race Share (HH Weighted) 0.208*** 0.199*** 
(6.88) (6.17) 
Year 
Sample Weighting Area*Y ear Year (HH Area* Year Year 
(Indiv) (Indiv) Weight) (Indiv) (Indiv) 
Demographics*(State, Y ear) FE Y Y Y Y Y 
Area * Year FE Y Y Y Y Y 
Observations 5,967,585 5,967,585 5,959,683 5,967,585 5,967,585 
R-squared | 0.402 0.406 0.393 0.402 0.406 
Table 4 — Mobility 


This Table examines whether the main effects are due to selection effects based on mobility. We conduct similar 
versions of the main regressions in Table 2, but limit the sample to various categories of women less likely to have 
relocated: i) those living in the state they were born in, ii) those who haven't moved in the past year, iii) those who 
haven’t moved in the past 5 years, and combinations of these. Standard errors are double clustered by year and state. 
Coefficients are in the top row, and t-statistics are below in parentheses, with *, ** and *** indicating statistical 
significance at the 10%, 5% and 1% level respectively. 


                                (1)                 (2)                 (3)                 (4)                 (5)
Race Share                      0.161***            0.200***            0.124***            0.191***            0.190***
                                (5.05)              (5.13)              (3.87)              (5.11)              (5.42)
Selection                       Living in State     No Move in          No Move in Last     No Move in Last     Any of
                                of Birth            Last Year           Five Years          One or Five Years   Previous
Demographics*(State, Year) FE   Y                   Y                   Y                   Y                   Y
Area * Year FE                  Y                   Y                   Y                   Y                   Y
Clustering                      State, Year         State, Year         State               State, Year         State, Year
Observations                    3,211,983           3,861,439           423,389             4,284,878           5,100,548
R-squared                       0.405               0.406               0.457               0.412               0.407




Table 5 — Different Time Periods 


This Table examines how racial diversity is associated with number of children in different time periods of US history. 
Specifications from Table 2 are run separately for i) 1850-1860, ii) 1870-1890, iii) 1900-1940, iv) 1950-1970, v) 1980-1990, 
and vi) 2000-2021. Panel A includes state by year fixed effects as well as age and race both interacted with state and year. 
Panel B also includes area by year fixed effects. Coefficients are in the top row, and t-statistics are below in 
parentheses, with *, ** and *** indicating statistical significance at the 10%, 5% and 1% level respectively. 


Panel A - No Area*Year Fixed Effects 

Race Share                            1.889*      1.015***    0.949***    1.054***    0.552***    0.515***
                                      (2.01)      (2.83)      (3.19)      (3.91)      (6.37)      (5.92)
Period                                1850-1860   1870-1890   1900-1940   1950-1970   1980-1990   2000-2021
Effect of 1 σ change (unconditional)  0.335       0.215       0.227       0.310       0.173       0.158
State*Year                            Y           Y           Y           Y           Y           Y
(Age, Race)*(State, Year) FE          Y           Y           Y           Y           Y           Y
Area*Year FE                          N           N           N           N           N           N
Observations                          19,929      105,637     467,974     595,615     653,356     5,314,218
R-squared                             0.302       0.252       0.181       0.282       0.261       0.269

Panel B - With Area*Year Fixed Effects 

Race Share                            -1.162      0.049       0.069       0.355*      0.338***    0.321***
                                      (-1.23)     (0.32)      (0.62)      (1.93)      (5.35)      (5.60)
Period                                1850-1860   1870-1890   1900-1940   1950-1970   1980-1990   2000-2021
Effect of 1 σ change (unconditional)  -0.206      0.010       0.017       0.104       0.106       0.098
State*Year                            Y           Y           Y           Y           Y           Y
(Age, Race)*(State, Year) FE          Y           Y           Y           Y           Y           Y
Area*Year FE                          Y           Y           Y           Y           Y           Y
Observations                          19,929      105,635     467,969     595,615     653,356     5,314,218
R-squared                             0.311       0.267       0.198       0.289       0.273       0.283
Table 6 — Effects by Race 


This Table examines how the effect of racial diversity on number of children varies with the race of the woman. 
Observations are taken for women ages 18-40, in US census surveys from 1850 to 2021. Specifications from Table 2 
are run with interactions between the baseline race share variable and the ten racial/ethnic groups we consider (from 
census categories): white, black, native American, Chinese, Japanese, Asian/Pacific Islander, other, two races, three 
or more races, and Hispanic. Controls are the same as in Table 2. Coefficients are in the top row, and t-statistics are 
below in parentheses, with *, ** and *** indicating statistical significance at the 10%, 5% and 1% level respectively. 


                                        (1)         (2)         (3)         (4)         (5)         (6)         (7)
Race Share * White                      1.000***    0.703***    0.327***    0.414***    0.263***    0.264***    0.185***
                                        (7.38)      (7.46)      (8.62)      (9.74)      (6.18)      (5.90)      (3.70)
Race Share * Black                      0.084       0.111       0.136**     0.070**     0.131***    0.137***    0.272***
                                        (0.76)      (1.26)      (2.22)      (2.50)      (3.33)      (3.75)      (4.24)
Race Share * Native American            0.474***    0.436***    0.705***    0.474***    0.167       0.188*      0.252*
                                        (3.63)      (4.81)      (9.99)      (4.56)      (1.67)      (1.79)      (1.90)
Race Share * Chinese                    -0.506**    -0.015      -0.067      -0.077      0.673***    0.704       0.957***
                                        (-2.18)     (-0.07)     (-0.63)     (-0.39)     (4.09)      (4.24)      (7.46)
Race Share * Japanese                   0.584**     -0.855***   0.156*      -0.025      0.630       0.681       1.114**
                                        (2.53)      (-5.18)     (2.06)      (-0.06)     (1.35)      (1.47)      (2.26)
Race Share * Asian / Pacific Islander   -0.259      -0.166      0.053       -0.110      0.171       0.182       0.141
                                        (-0.90)     (-1.12)     (0.35)      (-0.83)     (1.36)      (1.42)      (0.83)
Race Share * Other                      0.710       15.308***   4.475       3.121       3.540       3.700       5.838*
                                        (0.20)      (3.05)      (1.39)      (0.95)      (1.05)      (1.13)      (2.07)
Race Share * Two Races                  0.554       0.854       0.257       1.692**     2.723***    2.879       2.011***
                                        (0.47)      (1.60)      (0.60)      (2.19)      (4.20)      (4.33)      (4.34)
Race Share * Three or More Races        2.535       1.122*      1.381**     -0.435      -0.648      -0.626      -1.003
                                        (5.44)      (1.75)      (4.40)      (-0.50)     (-0.67)     (-0.64)     (-1.11)
Race Share * Hispanic                   0.122       0.093       0.137*      0.137*      -0.013      -0.000      0.158
                                        (1.38)      (1.13)      (2.03)      (1.73)      (-0.24)     (-0.00)     (1.56)
Earliest Year                           1850        1850        1850        1980        1980        1980        1980
Observations                            7,156,888   7,156,887   5,967,596   5,967,585   5,967,585   5,967,585   5,967,585
R-squared                               0.020       0.042       0.392       0.400       0.402       0.403       0.405

Control indicators are legible in the source only for columns (5) and (6): 

                                        (5)         (6)
Race                                    N           N
State-Year FE                           Y           Y
Demographics FE                         N           N
Demographics*(State, Year) FE           Y           Y
Area Type * Population FE               Y           N
Area Type * Population FE * Year        N           Y
Area Traits                             Y           N
Area Traits * (State, Year) FE          N           Y
Area FE                                 N           N
Area * Year FE                          N           N
Table 7 — International Results 


This Table examines the relationship between local racial diversity and the number of children a woman has, for different countries around the globe. All countries 
with census data on IPUMS that contain information on race are included. Observations are taken for all women age 18-40 at the time of the survey. The dependent 
variable is the number of children the woman has. The independent variable is the fraction of the local area population of the same race as the woman. Geography 
is measured at the finest level available (usually “level 2” on IPUMS, generally corresponding to regions within a state, but sometimes “level 1”, generally 
corresponding to a state, if there is no level 2 information). Race is measured according to whatever definition is used in the country in question. Controls are 
included for level 1 by year (colloquially, “state-year”), demographics, demographics by state and year, log population density by year, and local region (i.e., level 
2) by year. Demographics variables include whichever is available for that country, out of urban status, marital status, race, employment status, age, and educational 
attainment. Full country-level information on race definitions and controls is included in the Data Appendix. Panel A examines the United Kingdom and three 
countries from Africa — Mozambique, South Africa, and Zimbabwe. Panel B examines countries from Central America — Costa Rica, Cuba, El Salvador and 
Jamaica. Panel C examines countries from South America — Brazil, Colombia, Ecuador and Uruguay. Coefficients are in the top row, and t-statistics are below in 
parentheses, with *, ** and *** indicating statistical significance at the 10%, 5% and 1% level respectively. 


Panel A - UK and Africa 
Country UK UK Mozambique Mozambique Mozambique South Africa South Africa South Africa Zimbabwe Zimbabwe Zimbabwe 
Race Share 0.528*** 0.141 0.501*** 0.664*** -0.006 0.158*** 0.323** 0.319*** 3.360*** 3.887*** -0.560 
(5.05) (0.87) (3.40) (6.46) (-0.07) (3.28) (2.81) (7.49) (2.67) (3.62) (-0.09) 


State-Year FE N Y Y Y N Y Y N Y Y N 
Demographics FE Y Y Y N N Y N N Y N N 
Demographics* Year FE N N N Y Y N Y Y N N N 
Demographics*State FE N N N Y Y N Y Y N Y Y 
Ln Population Density* Year Y N N Y Y N Y Y N Y Y 
Local Region*Year N N N N Y N N Y N N Y 
Number of Years 1 1 2 2 2 4 4 4 1 1 1 

Race Share Level State State Local Local Local Local Local Local Local Local Local 

Clustering State State Local Local Local Local Local Local Local Local Local 


Observations 92,397 92,397 638,099 638,096 638,096 1,797,315 1,797,315 1,797,315 122,944 122,938 122,938 
R-squared 0.439 0.440 0.286 0.299 0.305 0.293 0.301 0.302 0.340 0.349 0.353 
Panel B - Central America 
Country Costa Rica Costa Rica Costa Rica Cuba Cuba Cuba El Salvador El Salvador El Salvador Jamaica Jamaica Jamaica 
Race Share -0.158 -0.212* 0.270* 0.012 -0.076* 0.064** 0.308*** 0.319*** 0.026 0.241 0.241 -0.045 
(-1.26) (-1.77) (2.00) (0.80) (-1.85) (2.23) (3.50) (3.72) (0.36) (0.42) (0.42) (-0.08) 


State-Year FE Y Y N Y Y N Y Y N Y Y N 
Demographics FE Y N N Y N N Y N N Y N N 
Demographics*Year FE N Y Y N Y Y N N N N Y Y 
Demographics*State FE N Y Y N Y Y N Y Y N Y Y 
Ln Population Density *Year N Y Y N Y Y N Y Y N Y Y 
Local Region*Year N N Y N N Y N N Y N N Y 
Number of Years 2 2 2 2 2 2 1 1 1 3 3 3 

Race Share Level Local Local Local Local Local Local Local Local Local State State State 

Clustering Local Local Local Local Local Local Local Local Local State State State 


Observations 148,777 148,777 148,777 383,755 383,755 383,755 108,368 108,364 108,364 39,723 39,723 39,723 
R-squared 0.432 0.444 0.446 0.268 0.273 0.275 0.405 0.414 0.417 0.301 0.301 0.303 


Panel C - South America 
Country Brazil Brazil Brazil Colombia Colombia Colombia Ecuador Ecuador Ecuador Uruguay Uruguay Uruguay 
Race Share -0.038*** -0.056*** -0.015** 0.017 -0.027 0.012 0.200** 0.108* 0.016 -1.311*** -1.869*** 1.056*** 
(-3.70) (-4.33) (-2.05) (0.28) (-0.77) (0.40) (2.45) (1.81) (0.39) (-5.08) (-8.35) (2.81) 


State-Year FE Y Y N Y Y N Y Y N Y Y N 
Demographics FE N N 

Demographics*Year FE N Y Y N N N N Y Y N Y N 
Demographics*State FE N Y Y N Y Y N Y Y N Y Y 
Ln Population Density *Year N Y Y N Y Y N Y Y N Y Y 
Local Region*Year N N Y N N Y N N Y N N Y 
Number of Years 4 4 4 1 1 1 2 2 2 1 1 1 

Race Share Level Local Local Local Local Local Local Local Local Local Local Local Local 

Clustering Local Local Local Local Local Local Local Local Local Local Local Local 


Observations 16,032,064 16,032,064 16,032,064 678,567 678,566 678,566 492,708 492,706 492,706 52,118 52,103 52,103 
R-squared 0.440 0.469 0.473 0.385 0.395 0.399 0.394 0.402 0.403 0.386 0.399 0.402 
Table 8 — Racial Diversity, Marriage and Divorce 


This Table examines how local levels of racial diversity affect outcomes related to marriage and divorce. Observations 
are taken for women ages 18-40, in US census surveys from 1850 to 2021. The main independent variable is race 
share — the fraction of the local population that is the same racial/ethnic group as the woman. Controls are the same 
as those in Table 2. In Panel A, the dependent variable is a dummy equal to one if the woman is currently married, 
and zero otherwise. In Panel B, the dependent variable is a dummy equal to one if the woman ever married (that is, if 
she is either currently married, widowed, or divorced), and zero otherwise. In Panel C, the sample is limited to women 
who got married, and the dependent variable is a dummy equal to one if they are currently divorced. In Panel D, the 
sample is limited to women who are currently married, and on their first marriage. The dependent variable is the age at 
which they got married. Coefficients are in the top row, and t-statistics are below in parentheses, with *, ** and *** 
indicating statistical significance at the 10%, 5% and 1% level respectively. 


Panel A - Currently Married 

Race Share                            0.192***   0.046**    0.081***   0.112***   0.040***   0.042***   0.036***
                                      (4.57)     (2.25)     (4.02)     (4.39)     (3.44)     (3.67)     (2.95)
Effect of 1 σ change (unconditional)  0.062      0.015      0.026      0.036      0.013      0.014      0.012
Race                                  Y          Y          N          N          N          N          N
State-Year FE                         N          Y          Y          Y          Y          Y          N
Demographics FE                       N          N          Y          N          N          N          N
Demographics*(State, Year) FE         N          N          N          Y          Y          Y          Y
Area Type * Population FE             N          N          N          Y          Y          N          N
Area Type * Population FE * Year      N          N          N          N          N          Y          N
Area Traits                           N          N          N          N          Y          N          N
Area Traits * (State, Year) FE        N          N          N          N          N          Y          N
Local Area * Year FE                  N          N          N          N          N          N          Y
Earliest Year                         1850       1850       1850       1980       1980       1980       1980
Observations                          7,118,413  7,118,412  5,967,596  5,967,587  5,967,587  5,967,587  5,967,587
R-squared                             0.030      0.067      0.298      0.310      0.314      0.315      0.319


Panel B - Ever Married 

Race Share                            0.191***   0.050**    0.074***   0.119***   0.041***   0.043***   0.036***
                                      (5.11)     (2.57)     (3.76)     (4.54)     (4.14)     (4.54)     (3.36)
Effect of 1 σ change (unconditional)  0.062      0.016      0.024      0.039      0.013      0.014      0.012
Earliest Year                         1850       1850       1850       1980       1980       1980       1980
Observations                          7,118,413  7,118,412  5,967,596  5,967,587  5,967,587  5,967,587  5,967,587
R-squared                             0.023      0.068      0.376      0.387      0.392      0.392      0.396

Control rows listed for this panel, without legible Y/N entries in the source: Race; State-Year FE; Demographics FE; Demographics*(State, Year) FE; Area Type * Population FE; Area Type * Population FE * Year; Area Traits; Area Traits * (State, Year) FE; Local Area * Year FE. 
Panel C - Age at First Marriage (Given Currently Married, Married Only Once) 

Race Share                                      -1.790***  -1.382***  -1.124***  -1.443***  -0.686***  -0.691***  -0.627***
                                                (-5.96)    (-7.61)    (-8.87)    (-6.98)    (-5.24)    (-5.34)    (-5.48)
Effect of 1 σ change (unconditional) in Months  -6.6       -5.1       -4.1       -5.3       -2.5       -2.5       -2.3
Race                                            Y          Y          N          N          N          N          N
State-Year FE                                   N          Y          Y          Y          Y          Y          N
Demographics FE                                 N          N          Y          N          N          N          N
Demographics*(State, Year) FE                   N          N          N          Y          Y          Y          Y
Area Type * Population FE                       N          N          N          Y          Y          N          N
Area Type * Population FE * Year                N          N          N          N          N          Y          N
Area Traits                                     N          N          N          N          Y          N          N
Area Traits * (State, Year) FE                  N          N          N          N          N          Y          N
Local Area * Year FE                            N          N          N          N          N          N          Y
Earliest Year                                   1850       1850       1850       1980       1980       1980       1980
Observations                                    1,730,380  1,730,380  1,730,380  1,730,368  1,730,368  1,730,368  1,730,368
R-squared                                       0.026      0.054      0.234      0.243      0.247      0.248      0.254

Panel D - Divorced, Given Married 

Race Share                                      -0.060**   0.012      -0.032***  -0.019**   0.003      0.002      0.007
                                                (-2.39)    (0.98)     (-3.99)    (-2.64)    (0.48)     (0.34)     (1.00)
Effect of 1 σ change (unconditional)            -0.019     0.004      -0.010     -0.006     0.001      0.001      0.002
Race                                            Y          Y          N          N          N          N          N
State-Year FE                                   N          Y          Y          Y          Y          Y          N
Demographics FE                                 N          N          Y          N          N          N          N
Demographics*(State, Year) FE                   N          N          N          Y          Y          Y          Y
Area Type * Population FE                       N          N          N          Y          Y          N          N
Area Type * Population FE * Year                N          N          N          N          N          Y          N
Area Traits                                     N          N          N          N          Y          N          N
Area Traits * (State, Year) FE                  N          N          N          N          N          Y          N
Local Area * Year FE                            N          N          N          N          N          N          Y
Earliest Year                                   1850       1850       1850       1980       1980       1980       1980
Observations                                    3,883,186  3,883,186  3,061,482  3,061,468  3,061,468  3,061,468  3,061,468
R-squared                                       0.019      0.038      0.127      0.136      0.137      0.138      0.141
Table 9 — Effects of Interracial Marriage by Race and Sex 


This Table examines whether the effects of racial diversity on the number of children are impacted by measures of interracial marriage. In this table, we consider both 
men and women, ages 18-40, using the same survey data from 1850 to 2021, and take as the dependent variable the number of children assigned to that person. In 
columns 1-3, we interact the race share measure with a measure of abnormal levels of interracial marriage for that racial group and year. This is done by taking the 
set of all men and women aged 18-50 in that survey year, and computing the number from that race who are currently married to someone of a different race (using 
our previous definitions of race). Next, we randomize the races of all men and women in that sample who are currently married, and compute the number of 
interracial marriages we have under this random pairing. We compute 1000 such simulations, and use these to create a mean random rate of interracial marriage, 
and a standard deviation. The abnormal interracial marriage measure is the actual rate minus the randomized mean, divided by the randomized standard deviation. 
This is interacted with race share, and included separately in column 1 (whereas in all other columns, the base variable is absorbed by the race-by-year fixed effect). 
In columns 5-8, we consider sex differences in the interracial marriage rate. For each race and year, we compute the number of women from that race who are 
married to someone of a different race, divided by the number of men from that race who are married to someone of a different race. This intermarriage sex ratio 
is then interacted with race share, and race share interacted with a dummy for the person being male. The other interaction terms (intermarriage sex ratio, where it 
is not omitted, and intermarriage sex ratio interacted with a male dummy) are included in the regression, but not reported. Coefficients are in the top row, and t- 
statistics are below in parentheses, with *, ** and *** indicating statistical significance at the 10%, 5% and 1% level respectively. 
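
The abnormal interracial marriage measure described in this caption is a simulation-based z-score. The sketch below illustrates one way to compute it, assuming that "randomizing the races" can be implemented by shuffling the wives' races across couples while holding the husbands fixed; the exact randomization used in the paper may differ. The function names and the `(husband_race, wife_race)` couple layout are hypothetical. Because the number of married people of each race is unchanged by the shuffle, a z-score computed on counts equals the z-score computed on rates.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def interracial_counts(husbands, wives):
    """Count, for each race, the people married to someone of a different race."""
    counts = Counter()
    for h, w in zip(husbands, wives):
        if h != w:
            counts[h] += 1
            counts[w] += 1
    return counts


def abnormal_intermarriage(couples, n_sims=1000, seed=0):
    """Return {race: (actual - simulated mean) / simulated SD}.

    couples: list of (husband_race, wife_race) pairs (hypothetical layout).
    The random benchmark shuffles wives' races across couples, which is an
    assumption made for this illustration.
    """
    rng = random.Random(seed)
    husbands = [h for h, _ in couples]
    wives = [w for _, w in couples]
    actual = interracial_counts(husbands, wives)
    races = set(husbands) | set(wives)

    sims = {r: [] for r in races}
    shuffled = wives[:]
    for _ in range(n_sims):
        rng.shuffle(shuffled)
        simulated = interracial_counts(husbands, shuffled)
        for r in races:
            sims[r].append(simulated[r])

    result = {}
    for r in races:
        sd = pstdev(sims[r])
        result[r] = (actual[r] - mean(sims[r])) / sd if sd > 0 else float("nan")
    return result


# Toy sample with strong homophily: 80 same-race couples, 10 mixed couples.
toy = [("a", "a")] * 40 + [("b", "b")] * 40 + [("a", "b")] * 5 + [("b", "a")] * 5
print(abnormal_intermarriage(toy, n_sims=1000, seed=0))
```

A strongly negative value indicates that a group intermarries far less often than random pairing would imply, which is the sense in which the measure captures homophily.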


                                             (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
Race Share                                   0.026       0.016       -0.021      0.121***    0.309**     0.304**     0.268***    0.206***
                                             (0.50)      (0.30)      (-0.47)     (2.97)      (2.82)      (2.74)      (3.77)      (3.30)
Race Share * Abnormal Intermarriage Rate     -0.140***   -0.144***   -0.103***   -0.021
                                             (-5.34)     (-5.42)     (-4.33)     (-0.94)
Race Share * InterMarriage Sex Ratio                                                         -0.180      -0.119      -0.215**    -0.135
                                                                                             (-1.38)     (-0.87)     (-2.33)     (-1.50)
Race Share * InterMarriage Sex Ratio * Male                                                  0.235***    0.115***    0.127***    0.141***
                                                                                             (5.78)      (4.77)      (5.61)      (5.77)
Sex*State-Year FE                            Y           Y           Y           N           Y           Y           Y           N
Demographics*(State, Year) FE                Y           N           N           N           Y           N           N           N
Sex*Demographics*(State, Year) FE            N           Y           Y           Y           N           Y           Y           Y
Area Type * Population FE * Year             N           N           Y           N           N           N           Y           N
Sex * Area Traits * (State, Year) FE         N           N           Y           N           N           N           Y           N
Sex * Area * Year FE                         N           N           N           Y           N           N           N           Y
Observations                                 11,853,697  11,853,691  11,853,691  11,853,691  11,853,697  11,853,691  11,853,691  11,853,691
R-squared                                    0.408       0.425       0.427       0.429       0.409       0.425       0.427       0.430
Table 10 — Trust and Birth Rates 


This Table examines how local measures of trust affect the relationship between racial isolation and the number of children a woman has. In Panel A, we consider 
the generalized state-level trust measures in 2006, from Putnam (2007). These are compared to state-level race share measures (in columns 1-6) and baseline local 
area measures (in columns 7-12). The years examined are either all years, only the years between 2001 and 2010, or 2006 only. In Panel B, we consider the county- 
level measures of social capital from Chetty et al. (2022), namely the volunteering rate, friendship clustering, and economic connectedness. These are compared 
with county-only measures of race share. Coefficients are in the top row, and t-statistics are below in parentheses, with *, ** and *** indicating statistical 
significance at the 10%, 5% and 1% level respectively. 


Panel A - Social Capital Survey Direct Trust Measure 
Race Share (State) 0.358*** 0.282*** 0.353*** 0.267*** 0.328*** 0.243***
(7.42) (7.11) (6.50) (5.85) (5.81) (4.79)
State Level Trust 0.432*** 0.465*** 0.449*** 0.326** 0.350*** 0.359***
(4.38) (4.77) (4.43) (2.77) (2.99) (3.03)
Race Share (Base) 0.294*** 0.274*** 0.286*** 0.264*** 0.283*** 0.262***
(9.61) (9.84) (7.85) (7.65) (8.78) (8.62)
Years All All 2001-2010 2001-2010 2006 2006 All All 2001-2010 2001-2010 2006 2006
Demographics * Year FE Y Y Y N Y Y Y N Y Y Y N
Observations 9,179,897 9,179,897 3,154,144 3,154,144 414,474 414,474 5,967,596 5,914,400 1,956,940 1,945,083 322,868 321,148
R-squared 0.384 0.384 0.369 0.370 0.371 0.372 0.393 0.394 0.381 0.382 0.377 0.378
Table 11 - Similarity In Other Variables and Number of Children


This table examines how diversity in other demographic variables is associated with different numbers of children. For each demographic variable, we take as the 
independent variable the fraction of residents in the local area who share the same value of the trait as the woman. Local area is taken as county, then city if county 
is missing, then detailed metro area if both city and county are missing. Panel A examines education, income and age. Education is a dummy variable for the highest 
level of schooling (e.g. high school, college, graduate degree). Income is deciles of income across the US in the year in question. Age is the fraction of the population 
that is between two years younger and ten years older than the woman. Panel B examines country of birth and citizenship. Country of birth is a dummy for whether 
the person was born in the US, and Citizenship is a dummy for whether the person is a US citizen. Panel C includes these variables together, and also computes the 
marginal effect of a one-standard deviation unconditional change in each of the variables for each specification. All other control variables are defined in Table 2. 
Coefficients are in the top row, and t-statistics are below in parentheses, with *, ** and *** indicating statistical significance at the 10%, 5% and 1% level 
respectively. 
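The share variables described in this note can be illustrated with a minimal Python sketch. It assumes unweighted counts, that the woman herself is counted among the residents of her area, and that each person is assigned to exactly one area by the county, city, metro fallback; none of these details is stated in the note. The field names (`county`, `city`, `metro`, `education`) and function names are hypothetical. The sketch covers categorical traits only, not the age-window share.

```python
from collections import Counter, defaultdict

# Hypothetical field names, listed in the fallback order described in the note.
AREA_KEYS = ("county", "city", "metro")


def local_area(person):
    """Return (level, code) for the first non-missing geography, or None."""
    for key in AREA_KEYS:
        code = person.get(key)
        if code is not None:
            return (key, code)
    return None


def same_trait_share(people, trait):
    """For each person, the fraction of residents of their local area
    who have the same value of `trait` (the person is included)."""
    area_totals = Counter()
    area_trait_counts = defaultdict(Counter)
    for person in people:
        area = local_area(person)
        if area is not None:
            area_totals[area] += 1
            area_trait_counts[area][person[trait]] += 1

    shares = []
    for person in people:
        area = local_area(person)
        if area is None:
            shares.append(None)  # no usable geography for this person
        else:
            shares.append(area_trait_counts[area][person[trait]] / area_totals[area])
    return shares


# Toy data: three residents of county A, one person identified only by city X.
residents = [
    {"county": "A", "education": "college"},
    {"county": "A", "education": "college"},
    {"county": "A", "education": "high school"},
    {"county": None, "city": "X", "education": "college"},
]
education_shares = same_trait_share(residents, "education")
```

In the toy data, the two college graduates in county A each receive a share of two thirds, the high school graduate one third, and the lone resident of city X a share of one.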


Panel A - Education, Income, Age 
Education Share 0.102* 0.176** -0.010 -0.011
(1.97) (2.72) (-0.30) (-0.33)
Income Decile Share 0.659*** 0.789*** 0.733*** 0.740***
(7.57) (10.06) (10.73) (10.94)
Age (-2,+10) Share -0.771*** -0.513*** 0.475*** 0.504***
(-4.98) (-5.09) (3.33) (3.50)
State-Year FE Y Y Y N Y Y Y N Y Y Y N
Demographics FE Y N N N Y N N N Y N N N
Demographics*(State, Year) FE N Y Y Y N Y Y Y N Y Y Y
Area Type * Population FE *Year N N Y N N N Y N N N Y N
Area Traits * Year N N Y N N N Y N N N Y N
Area * Year FE N N N Y N N N Y N N N Y


Observations 5,967,596 5,967,585 5,967,585 5,967,585 5,967,596 5,967,585 5,967,585 5,967,585 5,967,596 5,967,585 5,967,585 5,967,585 
R-squared 0.391 0.399 0.404 0.405 0.392 0.400 0.404 0.405 0.392 0.399 0.404 0.405 
Panel B - Country of Birth, Citizenship

US Born Share 0.152*** 0.178*** -0.024 -0.025
(4.20) (4.35) (-1.17) (-1.21)
Citizenship Share 0.354*** 0.405*** 0.047 0.046
(10.991) (7.73) (1.25) (1.21)
State-Year FE
Demographics FE
Demographics*(State, Year) FE
Area Type * Population FE *Year
Area Traits * (State, Year) FE 
Area * Year FE 
Observations 5,967,596 5,967,585 5,967,585 5,967,585 5,967,596 5,967,585 5,967,585 5,967,585 

R-squared 0.391 0.399 0.404 0.405 0.392 0.400 0.404 0.405 
Panel C - Other Variables in Combination


Coefficients Effect of 1 s.d. unconditional change 

Race Share 0.194*** 0.226*** 0.188*** 0.191*** 0.063 0.073 0.061 0.062
(9.98) (7.30) (5.96) (5.99)
Education Share 0.007 0.013 -0.090*** -0.092*** 0.001 0.002 -0.013 -0.013
(0.17) (0.29) (-3.09) (-3.14)
Income Decile Share 0.609*** 0.678*** 0.640*** 0.645*** 0.022 0.024 0.023 0.023
(8.20) (10.78) (10.44) (10.73)
Age (within 10 years) Share -0.605*** -0.296** 0.402*** 0.429*** -0.032 -0.016 0.021 0.023
(-3.39) (-2.84) (2.85) (3.01)
US Born Share -0.401*** -0.345*** -0.166*** -0.167*** -0.100 -0.086 -0.041 -0.042
(-8.48) (-5.61) (-5.60) (-5.34)
Citizenship Share 0.677*** 0.625*** 0.175*** 0.177*** 0.210 0.194 0.054 0.055
(11.35) (10.54) (4.13) (3.86)
State-Year FE
Demographics FE
Demographics*(State, Year) FE
Area Type * Population FE *Year
Area Traits * (State, Year) FE
Area * Year FE
Observations 5,967,596 5,967,585 5,967,585 5,967,585
R-squared 0.394 0.401 0.405 0.406
Table 12 - Time Series Effects of Diversity on Fertility


This Table examines how time series changes in the average local level of diversity (measured across the US) are associated with changes in US birth rates. The 
independent variable is the average across all respondents of race share, measured using either combined geography (i.e. county, then city if county is unavailable, 
then detailed metro area if both city and county are unavailable), city only, county only, or state. In Panel A, the dependent variable is the total fertility rate in the 
year after the diversity measure, taken from the St Louis Fed FRED database. Additional controls are included for the level of inflation, unemployment, and GDP 
growth. The first five columns use OLS regressions with data back to 1971. The last five use Newey-West regressions with five lags, and data from 2006. “Full 
Sample Change” is the change in the dependent variable (i.e. TFR) over the period in question. “Predicted Change” is the regression coefficient multiplied by the 
change in the independent variable from the first sample year to the last. “Fraction of Change Explained” is the ratio of these numbers. In Panel B, the dependent 
variable is the unadjusted average number of children for all women in the survey year. 
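The arithmetic behind the "Predicted Change" and "Fraction of Change Explained" rows can be sketched in a few lines of Python. The function names are hypothetical. The change in race share over the sample is not reported in the table, so the sketch backs it out from the rounded entries in the first column of Panel A; that implied value is an assumption for illustration only.

```python
def predicted_change(coefficient: float, regressor_change: float) -> float:
    """Regression coefficient times the first-to-last-year change in the regressor."""
    return coefficient * regressor_change


def fraction_explained(predicted: float, full_sample_change: float) -> float:
    """Predicted change divided by the realised change in the outcome."""
    return predicted / full_sample_change


# Rounded entries from the first column of Panel A.
coefficient = 1.440
full_sample_change = 0.602
reported_predicted_change = 0.393

# Assumption: the change in race share is not reported, so it is implied
# here by dividing the reported predicted change by the coefficient.
implied_race_share_change = reported_predicted_change / coefficient

fraction = fraction_explained(
    predicted_change(coefficient, implied_race_share_change),
    full_sample_change,
)
print(round(fraction, 3))
```

With these inputs the ratio rounds to 0.653, matching the first entry of the "Fraction of Change Explained" row. Because the table entries are themselves rounded, recomputing other columns in this way can differ from the reported ratio in the third decimal place.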


Panel A - Total Fertility Rate and Economic Controls 


Race Share (Base Combined) 1.440*** 2.569*** 3.969*** 3.950***
(3.86) (5.10) (10.29) (9.11)
Race Share (City Only) 2.470 7.902***
(1.50) (10.92)
Race Share (County Only) 2.090** 6.808***
(2.73) (12.50)
Race Share (State) 2.570*** 4.433***
(4.69) (8.49)
Inflation -0.042** -0.020 -0.024 -0.043** 0.001 0.008 0.010 0.002
(-2.70) (-0.80) (-1.10) (-2.56) (0.03) (0.55) (0.70) (0.13)
GDP Growth -0.011 -0.011 -0.008 -0.005 0.005 0.008 0.019** 0.012
(-0.74) (-0.54) (-0.44) (-0.30) (0.55) (0.74) (2.86) (1.39)
Unemployment 0.329 0.152 0.094 0.748 0.598* 0.479 0.203 0.902**
(0.22) (0.07) (0.04) (0.48) (1.93) (1.11) (0.51) (2.90)
First Year 1971 1971 1981 1971 1971 2006 2006 2006 2006 2006
Method OLS OLS OLS OLS OLS NW NW NW NW NW
Full Sample Change 0.602 0.602 0.148 0.602 0.602 0.444 0.444 0.444 0.444 0.444
Predicted Change 0.393 0.702 0.364 0.513 0.704 0.426 0.421 0.462 0.512 0.424
Fraction of Change Explained 0.653 1.166 2.458 0.852 1.170 0.960 0.948 1.040 1.153 0.954
Observations 21 21 19 20 21 16 16 16 16 16
R-squared 0.439 0.643 0.159 0.381 0.605 0.886 0.892 0.865 0.924 0.880
Panel B - Number of Children, Long Sample
Race Share (Base Combined) 0.853*** 0.644**
(6.51) (2.32)
Race Share (City Only) 0.630*** 0.478**
(6.25) (2.62)
Race Share (County Only) 2.021*** 1.767**
(8.84) (3.62)
Race Share (State) 0.791*** 0.597**
(6.31) (2.30)
First Year 1850 1850 1950 1850 1850 1850 1850 1850
Years Included All All All All Decades Decades Decades Decades
Method OLS OLS OLS OLS OLS OLS OLS OLS
Full Sample Change 0.496 0.496 0.018 0.496 0.517 0.517 0.039 0.517
Predicted Change 0.426 0.349 0.692 0.415 0.314 0.264 0.583 0.304
Fraction of Change Explained 0.858 0.704 38.916 0.836 0.607 0.511 15.053 0.588
Observations 32 30 22 32 16 15 7 16
R-squared 0.585 0.583 0.796 0.571 0.278 0.346 0.724 0.274




Data Appendix 


U.S. Data 

U.S. Census data sources are taken from the IPUMS default samples for each year, namely: 
1% sample from 1850, 1860, 1870, 1880, 1900, 1910, 1920, 1930, 1940, 1950 and 1960 
1% metro fm1 sample from 1970
1% metro sample from 1980 and 1990
1% sample from 2000
10% sample from 2010
ACS surveys from 2000-2021


International Data 


Data for the number of samples, observations, control variables and racial classifications for each of the countries are listed below.
Country      # Obs.      Mean # Children  # Samples  # Coarse Geography Units  # Fine Geography Units  Control Variables
Brazil       22,877,029  1.646            6          25                        2,040                   Age, Education, Employment, Marital Status, Race, Urban
Colombia     2,215,042   1.542            4          22                        438                     Age, Education, Employment, Marital Status, Race, Urban
Costa Rica   231,878     1.499            4          7                         55                      Age, Education, Employment, Marital Status, Race, Urban
Cuba         383,755     0.970            2          14                        137                     Age, Education, Employment, Marital Status, Race
Ecuador      909,119     1.601            5          14                        79                      Age, Education, Employment, Marital Status, Race, Urban
El Salvador  201,637     1.499            2          14                        103                     Age, Education, Employment, Marital Status, Race, Urban
Jamaica      121,582     1.404            3          14                        N/A                     Age, Education, Employment, Marital Status, Race, Urban

Country      Race             Number      Pct
Brazil       White            10,232,595  54.92
             Black            1,202,607   6.46
             Indigenous       40,650      0.22
             Asian            126,174     0.68
             Brown            7,028,211   37.72
Colombia     White            561,681     82.77
             Black            74,113      10.92
             Indigenous       41,884      6.17
             Other            889         0.13
Costa Rica   White            138,357
             Black            2,232
             Indigenous       1,108
             Asian            156
             Chinese          154
             Mulatto          6,015
             Other            755
Cuba         White            242,150
             Black            35,451
             Mixed Race       106,154
Ecuador      White            39,427
             Black            7,513
             Afro-Ecuadorian  12,233
             Indigenous       30,938
             Mestizo          370,966
             Mulatto          11,909
             Other            1,753
             Montubio         17,969
El Salvador  White            14,437
             Black            117
             Indigenous       261
             Mestizo          92,930
             Other            623
Jamaica      White            229
             Black            103,095
             Chinese          213
             Indian           1,559
             Other Asian      13
             Mixed Race       11,304
             Other            98
Country         # Obs.     Mean # Children  # Samples  # Fine Geography Units  Control Variables
Mozambique      651,821    1.932            2          143                     Age, Education, Employment, Marital Status, Race, Urban
South Africa    3,141,423  1.042            5          19                      Age, Education, Employment, Marital Status, Race, Urban
United Kingdom  92,397     1.020            1          N/A                     Age, Employment, Marital Status, Race
Uruguay         282,446    1.229            6          67                      Age, Education, Employment, Marital Status, Race, Urban
Zimbabwe        123,039    1.400            1          88                      Age, Education, Employment, Marital Status, Race, Urban

# Coarse Geography Units: 11, 11, 19, 10

Country         Race               Number     Pct
Mozambique      White              457        0.07
                Black              633,924    99.35
                Indian             489        0.08
                Pakistani          57         0.01
                Mixed Race         3,037      0.48
                Other              135        0.02
South Africa    White              155,624    6.39
                Black African      2,014,603  82.71
                Asian              53,757     2.21
                Coloured           208,378    8.56
                Other              3,288      0.13
United Kingdom  White              86,347     93.45
                Black African      499        0.54
                Black Caribbean    1,084      1.17
                Other Black        350        0.38
                Chinese            388        0.42
                Indian             1,673      1.81
                Pakistani          820        0.89
                Bangladeshi        197        0.21
                Other Asian        504        0.55
                Other              535        0.58
Uruguay         White              80,648     88.89
                Black              3,433      3.78
                Indigenous         1,529      1.69
                Asian              183        0.2
                Mestizo            917        1.01
                Two or More Races  3,940      4.34
                Other              82         0.09
Zimbabwe        White              145        0.12
                Black              122,537    99.67
                Asian              84         0.07
                Mixed Race         170        0.14
                Other              8          0.01